TITLE OF THE INVENTION

HEPATITIS C VIRUS VACCINE 

RELATED APPLICATIONS 
The present application claims priority to provisional applications U.S.
Serial No. 60/363,774, filed March 13, 2002, and U.S. Serial No. 60/328,655, filed
October 11, 2001, each of which is hereby incorporated by reference herein.

BACKGROUND OF THE INVENTION 

The references cited in the present application are not admitted to be
prior art to the claimed invention.

About 3% of the world's population is infected with the Hepatitis C
virus (HCV). (Wasley et al., Semin. Liver Dis. 20, 1-16, 2000.) Exposure to HCV
results in an overt acute disease in a small percentage of cases, while in most
instances the virus establishes a chronic infection causing liver inflammation and
slowly progresses into liver failure and cirrhosis. (Iwarson, FEMS Microbiol. Rev. 14,
201-204, 1994.) In addition, epidemiological surveys indicate an important role of
HCV in the pathogenesis of hepatocellular carcinoma. (Kew, FEMS Microbiol. Rev.
14, 211-220, 1994, Alter, Blood 85, 1681-1695, 1995.)

Prior to the implementation of routine blood screening for HCV in
1992, most infections were contracted by inadvertent exposure to contaminated blood,
blood products or transplanted organs. In those areas where blood screening for HCV
is carried out, HCV is primarily contracted through direct percutaneous exposure to
infected blood, i.e., intravenous drug use. Less frequent methods of transmission
include perinatal exposure, hemodialysis, and sexual contact with an HCV infected
person. (Alter et al., N. Engl. J. Med. 341(8), 556-562, 1999, Alter, J. Hepatol. 31,
Suppl. 88-91, 1999. Semin. Liver. Dis. 201, 1-16, 2000.)

The HCV genome consists of a single strand RNA about 9.5 kb
encoding a precursor polyprotein of about 3000 amino acids. (Choo et al., Science
244, 362-364, 1989, Choo et al., Science 244, 359-362, 1989, Takamizawa et al., J.
Virol. 65, 1105-1113, 1991.) The HCV polyprotein contains the viral proteins in the
order C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B.

Individual viral proteins are produced by proteolysis of the HCV
polyprotein. Host cell proteases release the putative structural proteins C, E1, E2, and
p7, and create the N-terminus of NS2 at amino acid 810. (Mizushima et al., J. Virol.
68, 2731-2734, 1994, Hijikata et al., P.N.A.S. USA 90, 10773-10777, 1993.)

The non-structural proteins NS3, NS4A, NS4B, NS5A and NS5B
presumably form the virus replication machinery and are released from the
polyprotein. A zinc-dependent protease associated with NS2 and the N-terminus of
NS3 is responsible for cleavage between NS2 and NS3. (Grakoui et al., J. Virol. 67,
1385-1395, 1993, Hijikata et al., P.N.A.S. USA 90, 10773-10777, 1993.) A distinct
serine protease located in the N-terminal domain of NS3 is responsible for proteolytic
cleavages at the NS3/NS4A, NS4A/NS4B, NS4B/NS5A and NS5A/NS5B junctions.
(Bartenschlager et al., J. Virol. 67, 3835-3844, 1993, Grakoui et al., Proc. Natl. Acad.
Sci. USA 90, 10583-10587, 1993, Tomei et al., J. Virol. 67, 4017-4026, 1993.)
NS4A provides a cofactor for NS3 activity. (Failla et al., J. Virol. 68, 3753-3760,
1994, De Francesco et al., U.S. Patent No. 5,739,002.)

NS5A is a highly phosphorylated protein conferring interferon
resistance. (De Francesco et al., Semin. Liver Dis., 20(1), 69-83, 2000, Pawlotsky,
Viral Hepat. Suppl. 1, 47-48, 1999.)

NS5B provides an RNA-dependent RNA polymerase. (De Francesco
et al., International Publication Number WO 96/37619, Behrens et al., EMBO 15,
12-22, 1996, Lohmann et al., Virology 249, 108-118, 1998.)

SUMMARY OF THE INVENTION 

The present invention features Ad6 vectors and a nucleic acid encoding 
a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide containing an inactive NS5B 
RNA-dependent RNA polymerase region. The nucleic acid is particularly useful as a 
component of an adenovector or DNA plasmid vaccine providing a broad range of
antigens for generating an HCV specific cell mediated immune (CMI) response 
against HCV. 

A HCV specific CMI response refers to the production of cytotoxic T 
lymphocytes and T helper cells that recognize an HCV antigen. The CMI response 
may also include non-HCV specific immune effects.

Preferred nucleic acids encode a Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide that is substantially similar to SEQ. ID. NO. 1 and has sufficient protease 
activity to process itself to produce at least a polypeptide substantially similar to the 
NS5B region present in SEQ. ID. NO. 1. The produced polypeptide corresponding to 
NS5B is enzymatically inactive. More preferably, the HCV polypeptide has sufficient
protease activity to produce polypeptides substantially similar to the NS3, NS4A,
NS4B, NS5A, and NS5B regions present in SEQ. ID. NO. 1.

Reference to a "substantially similar sequence" indicates an identity of at
least about 65% to a reference sequence. Thus, for example, polypeptides having an amino
acid sequence substantially similar to SEQ. ID. NO. 1 have an overall amino acid identity
of at least about 65% to SEQ. ID. NO. 1.

Polypeptides corresponding to NS3, NS4A, NS4B, NS5A, and NS5B
have an amino acid sequence identity of at least about 65% to the corresponding
region in SEQ. ID. NO. 1. Such corresponding polypeptides are also referred to
herein as NS3, NS4A, NS4B, NS5A, and NS5B polypeptides.

Thus, a first aspect of the present invention describes a nucleic acid 
comprising a nucleotide sequence encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide substantially similar to SEQ. ID. NO. 1. The encoded polypeptide has
sufficient protease activity to process itself to produce an NS5B polypeptide that is
enzymatically inactive.

In a preferred embodiment, the nucleic acid is an expression vector 
capable of expressing the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide in a 
desired human cell. Expression inside a human cell has therapeutic applications for 
actively treating an HCV infection and for prophylactically treating against an HCV
infection.

An expression vector contains a nucleotide sequence encoding a
polypeptide along with regulatory elements for proper transcription and processing.
The regulatory elements that may be present include those naturally associated with
the nucleotide sequence encoding the polypeptide and exogenous regulatory elements
not naturally associated with the nucleotide sequence. Exogenous regulatory elements
such as an exogenous promoter can be useful for expression in a particular host, such
as in a human cell. Examples of regulatory elements useful for functional expression
include a promoter, a terminator, a ribosome binding site, and a polyadenylation
signal.

Another aspect of the present invention describes a nucleic acid
comprising a gene expression cassette able to express in a human cell a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1. The
polypeptide can process itself to produce an enzymatically inactive NS5B protein. The
gene expression cassette contains at least the following:

a) a promoter transcriptionally coupled to a nucleotide sequence
encoding a polypeptide;

b) a 5' ribosome binding site functionally coupled to the nucleotide
sequence,

c) a terminator joined to the 3' end of the nucleotide sequence, and

d) a 3' polyadenylation signal functionally coupled to the nucleotide
sequence.

Reference to "transcriptionally coupled" indicates that the promoter is 
positioned such that transcription of the nucleotide sequence can be brought about by 
RNA polymerase binding at the promoter. Transcriptionally coupled does not require
that the sequence being transcribed is adjacent to the promoter. 

Reference to "functionally coupled" indicates the ability to mediate an 
effect on the nucleotide sequence. Functionally coupled does not require that the 
coupled sequences be adjacent to each other. A 3' polyadenylation signal functionally 
coupled to the nucleotide sequence facilitates cleavage and polyadenylation of the
transcribed RNA. A 5' ribosome binding site functionally coupled to the nucleotide
sequence facilitates ribosome binding.

In preferred embodiments the nucleic acid is a DNA plasmid vector or 
an adenovector suitable for either therapeutic application in treating HCV or as an
intermediate in the production of a therapeutic vector. Treating HCV includes
actively treating an HCV infection and prophylactically treating against an HCV 
infection. 

Another aspect of the present invention describes an adenovector
comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette able to express
a polypeptide substantially similar to SEQ. ID. NO. 1 that is produced by a process
involving (a) homologous recombination and (b) adenovector rescue. The
homologous recombination step produces an adenovirus genome plasmid. The
adenovector rescue step produces the adenovector from the adenovirus genome plasmid.

Adenovirus genome plasmids described herein contain a recombinant 
adenovirus genome having a deletion in the E1 region and optionally in the E3 region
and a gene expression cassette inserted into one of the deleted regions. The 
recombinant adenovirus genome is made of regions substantially similar to one or 
more adenovirus serotypes. 

Another aspect of the present invention describes an adenovector 
consisting of the nucleic acid sequence of SEQ. ID. NO. 4 or a derivative thereof,
wherein said derivative thereof has the HCV polyprotein encoding sequence present
in SEQ. ID. NO. 4 replaced with the HCV polyprotein encoding sequence of either
SEQ. ID. NO. 3, SEQ. ID. NO. 10 or SEQ. ID. NO. 11.

Another aspect of the present invention describes a cultured 
recombinant cell comprising a nucleic acid containing a sequence encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1.
The recombinant cell has a variety of uses such as being used to replicate nucleic acid 
encoding the polypeptide in vector construction methods. 

Another aspect of the present invention describes a method of making
an adenovector comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette
able to express a polypeptide substantially similar to SEQ. ID. NO. 1. The method
involves the steps of (a) producing an adenovirus genome plasmid containing a
recombinant adenovirus genome with deletions in the E1 and E3 regions and a gene
expression cassette inserted into one of the deleted regions and (b) rescuing the
adenovector from the adenovirus genome plasmid.

Another aspect of the present invention describes a pharmaceutical
composition comprising a vector for expressing a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide substantially similar to SEQ. ID. NO. 1 and a pharmaceutically
acceptable carrier. The vector is suitable for administration and polypeptide
expression in a patient.

A "patient" refers to a mammal capable of being infected with HCV. 
A patient may or may not be infected with HCV. Examples of patients are humans
and chimpanzees.

Another aspect of the present invention describes a method of treating
a patient comprising the step of administering to the patient an effective amount of a
vector expressing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially
similar to SEQ. ID. NO. 1. The vector is suitable for administration and polypeptide
expression in the patient. 

The patient undergoing treatment may or may not be infected with
HCV. For a patient infected with HCV, an effective amount is sufficient to achieve
one or more of the following effects: reduce the ability of HCV to replicate, reduce
HCV load, increase viral clearance, and increase one or more HCV specific CMI
responses. For a patient not infected with HCV, an effective amount is sufficient to
achieve one or more of the following: an increased ability to produce one or more
components of a HCV specific CMI response to a HCV infection, a reduced
susceptibility to HCV infection, and a reduced ability of the infecting virus to
establish persistent infection for chronic disease.

Another aspect of the present invention features a recombinant nucleic 
acid comprising an Ad6 region and a region not present in Ad6. Reference to
"recombinant" nucleic acid indicates the presence of two or more nucleic acid regions
not naturally associated with each other. Preferably, the Ad6 recombinant nucleic 
acid contains Ad6 regions and a gene expression cassette coding for a polypeptide 
heterologous to Ad6. 

Other features and advantages of the present invention are apparent 
from the additional descriptions provided herein including the different examples.
The provided examples illustrate different components and methodology useful in 
practicing the present invention. The examples do not limit the claimed invention. 
Based on the present disclosure the skilled artisan can identify and employ other 
components and methodology useful for practicing the present invention. 


BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B illustrate SEQ. ID. NO. 1.

Figures 2A, 2B, 2C, and 2D illustrate SEQ. ID. NO. 2. SEQ. ID. NO.
2 provides a nucleotide sequence coding for SEQ. ID. NO. 1 along with an optimized
internal ribosome entry site and TAAA termination. Nucleotides 1-6 provide an
optimized internal ribosome entry site. Nucleotides 7-5961 code for a HCV
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide with nucleotides in positions 5137 to
5145 providing a AlaAlaGly sequence in amino acid positions 1711 to 1713 that
renders NS5B inactive. Nucleotides 5962-5965 provide a TAAA termination.
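
The coordinates above can be cross-checked with a short calculation. The following Python sketch is illustrative only: it assumes the coding region starts at nucleotide 7 and ends at nucleotide 5961, as stated for SEQ. ID. NO. 2, and the function and constant names are hypothetical helpers rather than part of the disclosure.

```python
# Illustrative sketch; names are hypothetical helpers, not from the disclosure.
CDS_START = 7     # first coding nucleotide of SEQ. ID. NO. 2 (from the text)
CDS_END = 5961    # last coding nucleotide of SEQ. ID. NO. 2 (from the text)


def codon_span(aa_position, cds_start=CDS_START):
    """Return (first, last) 1-based nucleotide positions of one codon."""
    if aa_position < 1:
        raise ValueError("amino acid positions are 1-based")
    first = cds_start + 3 * (aa_position - 1)
    return first, first + 2


def encoded_length(cds_start=CDS_START, cds_end=CDS_END):
    """Number of codons in the coding region (termination not included)."""
    n_nt = cds_end - cds_start + 1
    if n_nt % 3:
        raise ValueError("coding region is not a whole number of codons")
    return n_nt // 3


# Residues 1711-1713 (AlaAlaGly) map to nucleotides 5137-5145.
print(codon_span(1711)[0], codon_span(1713)[1])  # 5137 5145
print(encoded_length())                          # 1985
```

Under these assumptions the 5955 coding nucleotides correspond to a polypeptide of 1985 amino acids.
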
Figures 3A, 3B, 3C, and 3D illustrate SEQ. ID. NO. 3. SEQ. ID. NO.
3 is a codon optimized version of SEQ. ID. NO. 2. Nucleotides 7-5961 encode a
HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide.

Figures 4A-4M illustrate MRKAd6-NSmut (SEQ. ID. NO. 4). SEQ.
ID. NO. 4 is an adenovector containing an expression cassette where the polypeptide
of SEQ. ID. NO. 1 is encoded by SEQ. ID. NO. 2. Base pairs 1-450 correspond to the
Ad5 bp 1 to 450; base pairs 462 to 1252 correspond to the human CMV promoter;
base pairs 1258 to 1267 correspond to the Kozak sequence; base pairs 1264 to 7222
correspond to the NS genes; base pairs 7231 to 7451 correspond to the BGH
polyadenylation signal; base pairs 7469 to 9506 correspond to Ad5 base pairs 3511 to
5548; base pairs 9507 to 32121 correspond to Ad6 base pairs 5542 to 28156; base
pairs 32122 to 35117 correspond to Ad6 base pairs 30789 to 33784; and base pairs
35118 to 37089 correspond to Ad5 base pairs 33967 to 35935.

Figures 5A-5O illustrate SEQ. ID. NOs. 5 and 6. SEQ. ID. NO. 5
encodes a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide with an active
RNA dependent RNA polymerase. SEQ. ID. NO. 6 provides the amino acid sequence
for the polypeptide.

Figures 6A-6C provide the nucleic acid sequence for pV1JnsA (SEQ.
ID. NO. 7).

Figures 7A-7N provide the nucleic acid sequence for the Ad6 genome
(SEQ. ID. NO. 8).

Figures 8A-8K provide the nucleic acid sequence for the Ad5 genome
(SEQ. ID. NO. 9).

Figure 9 illustrates different regions of the Ad6 genome. The linear
(35759 bp) ds DNA genome is indicated by two parallel lines and is divided into 100
map units. Transcription units are shown relative to their position and orientation in
the genome. Early genes (E1A, E1B, E2A/B, E3 and E4) are indicated by gray arrows.
Late genes (L1 to L5), indicated by black arrows, are produced by alternative splicing
of a transcript produced from the major late promoter (MLP) and all contain the
tripartite leader (1, 2, 3) at their 5' ends. The E1 region is located from approximately
1.0 to 11.5 map units, the E2 region from 75.0 to 11.5 map units, E3 from 76.1 to 86.7
map units, and E4 from 99.5 to 91.2 map units. The major late transcription unit is
located between 16.0 and 91.2 map units.
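
Because the 35759 bp genome is divided into 100 map units, one map unit corresponds to roughly 358 bp. The Python sketch below converts map-unit positions to approximate base-pair positions. It is an illustration only: the helper name is hypothetical, the rounding to the nearest base pair is an assumption, and the boundaries are the approximate values quoted for Figure 9.

```python
GENOME_LENGTH_BP = 35759  # Ad6 genome length given for Figure 9


def map_units_to_bp(map_units, genome_length=GENOME_LENGTH_BP):
    """Convert a map-unit position (0-100) to an approximate bp position."""
    if not 0 <= map_units <= 100:
        raise ValueError("map units run from 0 to 100")
    return round(map_units * genome_length / 100)


# Approximate boundaries quoted in the text, in map units. A start greater
# than the end reflects the listed orientation of the transcription unit.
regions = {
    "E3": (76.1, 86.7),
    "E4": (99.5, 91.2),
    "major late transcription unit": (16.0, 91.2),
}
for name, (start, end) in regions.items():
    print(name, map_units_to_bp(start), map_units_to_bp(end))
```
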

Figure 10 illustrates homologous recombination to recover pAdE1-E3+
containing Ad6 and Ad5 regions.

Figure 11 illustrates homologous recombination to recover a pAdE1-E3+
containing Ad6 regions.

Figure 12 illustrates a western blot on whole-cell extracts from 293
cells transfected with plasmid DNA expressing different HCV NS cassettes. Mature
NS3 and NS5A products were detected with specific antibodies. "pV1Jns-NS" refers
to a pV1JnsA plasmid where a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide is
encoded by SEQ. ID. NO. 5, and SEQ. ID. NO. 5 is inserted between bases 1881 and
1912 of SEQ. ID. NO. 7. "pV1Jns-NSmut" refers to a pV1JnsA plasmid where SEQ.
ID. NO. 2 is inserted between bases 1882 and 1925 of SEQ. ID. NO. 7.
"pV1Jns-NSOPTmut" refers to a pV1JnsA plasmid where SEQ. ID. NO. 3 is inserted between
bases 1881 and 1905 of SEQ. ID. NO. 7.




WO 03/031588 



PCT/US02/32512 



Figures 13A and 13B illustrate T cell responses by IFNγ ELIspot
induced in C57black6 mice (A) and BalbC mice (B) by two injections of 25 µg and
50 µg, respectively, of plasmid DNA encoding the different HCV NS cassettes with
Gene Electro-Transfer (GET).

Figure 14 illustrates protein expression from different adenovectors
upon infection of HeLa cells. MRKAd5-NSmut is an adenovector based on an Ad5
sequence (SEQ. ID. NO. 9), where the Ad5 genome has an E1 deletion of base pairs
451 to 3510, an E3 deletion of base pairs 28134 to 30817, and has the
NS3-NS4A-NS4B-NS5A-NS5B expression cassette as provided in base pairs 451 to 7468 of SEQ.
ID. NO. 4 inserted between positions 450 and 3511. Ad5-NS is an adenovector based
on an Ad5 backbone with an E1 deletion of base pairs 342 to 3523, and E3 deletion of
base pairs 28134 to 30817 and containing an expression cassette encoding a
NS3-NS4A-NS4B-NS5A-NS5B from SEQ. ID. NO. 5. "MRKAd6-NSOPTmut" refers to
an adenovector having a modified SEQ. ID. NO. 4 sequence, wherein base pairs 1258
to 7222 of SEQ. ID. NO. 4 are replaced with SEQ. ID. NO. 3.

Figure 15 illustrates T cell responses by IFNγ ELIspot induced in
C57black6 mice by two injections of 10^9 vp of adenovectors containing different
HCV non-structural gene cassettes.

Figures 16A-16D illustrate T cell responses by IFNγ ELIspot induced
in Rhesus monkeys by one or two injections of 10^10 vp (A) or 10^11 vp (B) of
adenovectors containing different HCV non-structural gene cassettes.

Figures 17A and 17B illustrate CD8+ T cell responses by IFNγ ICS
induced in Rhesus monkeys by two injections of 10^10 vp (A) or 10^11 vp (B) of
adenovectors encoding the different HCV non-structural gene cassettes.

Figures 18A-18F illustrate T cell responses by bulk CTL assay induced
in Rhesus monkeys by two injections of 10^11 vp of Ad5-NS (A), MRKAd5-NSmut
(B), or MRKAd6-NSmut (C).

Figure 19 illustrates the plasmid pE2. 

Figures 20A-D illustrate the partial codon optimized sequence
NSsuboptmut (SEQ. ID. NO. 10). Coding sequence for the
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide is from base 7 to 5961.






DETAILED DESCRIPTION OF THE INVENTION 

The present invention features Ad6 vectors and nucleic acid encoding a 
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide that contains an inactive NS5B 
region. Providing an inactive NS5B region supplies NS5B antigens while reducing the 
possibility of adverse side effects due to an active viral RNA polymerase. Uses of the
featured nucleic acid include use as a vaccine component to introduce into a cell an 
HCV polypeptide that provides a broad range of antigens for generating a CMI 
response against HCV, and as an intermediate for producing such a vaccine 
component. 

The adaptive cellular immune response can function to recognize viral
antigens in HCV infected cells throughout the body due to the ubiquitous distribution
of major histocompatibility complex (MHC) class I and II expression, to induce
immunological memory, and to maintain immunological memory. These functions
are attributed to antigen-specific CD4+ T helper (Th) and CD8+ cytotoxic T cells
(CTL).

Upon activation via their specific T cell receptors, HCV specific Th
cells fulfill a variety of immunoregulatory functions, most of them mediated by Th1
and Th2 cytokines. HCV specific Th cells assist in the activation and differentiation
of B cells and induction and stimulation of virus-specific cytotoxic T cells. Together
with CTL, Th cells may also secrete IFN-γ and TNF-α that inhibit replication and
gene expression of several viruses. Additionally, Th cells and CTL, the main effector
cells, can induce apoptosis and lysis of virus infected cells.

HCV specific CTL are generated from antigens processed by 
professional antigen presenting cells (pAPCs). Antigens can be either synthesized 
within or introduced into pAPCs. Antigen synthesis in a pAPC can be brought about
by introducing into the cell an expression cassette encoding the antigen. 

A preferred route of nucleic acid vaccine administration is an 
intramuscular route. Intramuscular administration appears to result in the introduction 
and expression of nucleic acid into somatic cells and pAPCs. HCV antigens produced 
in the somatic cells can be transferred to pAPCs for presentation in the context of
MHC class I molecules. (Donnelly et al., Annu. Rev. Immunol. 15:617-648, 1997.)

pAPCs process longer length antigens into smaller peptide antigens in
the proteasome complex. The antigen is translocated into the endoplasmic
reticulum/Golgi complex secretory pathway for association with MHC class I
proteins. CD8+ T lymphocytes recognize antigen associated with class I MHC via the
T cell receptor (TCR) and the CD8 cell surface protein.

Using a nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide as a vaccine component allows for production of a broad range of 
antigens capable of generating CMI responses from a single vector. The polypeptide
should be able to process itself sufficiently to produce at least a region corresponding
to NS5B. Preferred nucleic acids encode an amino acid sequence substantially similar
to SEQ. ID. NO. 1 that has sufficient protease activity to process itself to produce
individual HCV polypeptides substantially similar to the NS3, NS4A, NS4B, NS5A,
and NS5B regions present in SEQ. ID. NO. 1.

A polypeptide substantially similar to SEQ. ID. NO. 1 with sufficient 
protease activity to process itself in a cell provides the cell with T cell epitopes that 
are present in several different HCV strains. Protease activity is provided by NS3 and 
NS3/NS4A proteins digesting the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide at
the appropriate cleavage sites to release polypeptides corresponding to NS3, NS4A,
NS4B, NS5A, and NS5B. Self-processing of the Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide generates polypeptides that approximate naturally occurring HCV polypeptides.

Based on the guidance provided herein a sufficiently strong immune 
response can be generated to achieve beneficial effects in a patient. The provided
guidance includes information concerning HCV sequence selection, vector selection,
vector production, combination treatment, and administration.



I. HCV SEQUENCES

A variety of different nucleic acid sequences can be used as a vaccine
component to supply a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide to a
cell or as an intermediate to produce vaccine components. The starting point for
obtaining suitable nucleic acid sequences is preferably a naturally occurring
NS3-NS4A-NS4B-NS5A-NS5B polypeptide sequence modified to produce an inactive
NS5B.

The use of a HCV nucleic acid sequence providing HCV non-structural
antigens to generate a CMI response is mentioned by Cho et al., Vaccine
17:1136-1144, 1999, Paliard et al., International Publication Number WO 01/30812 (not
admitted to be prior art to the claimed invention), and Coit et al., International
Publication Number WO 01/38360 (not admitted to be prior art to the claimed
invention). Such references fail to describe, for example, a polypeptide that processes
itself to produce an inactive NS5B, and the particular combinations of HCV
sequences and delivery vehicles employed herein.

Modifications to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide sequence can be produced by altering the encoding nucleic acid. 
Alterations can be performed to create deletions, insertions and substitutions.

Small modifications can be made in NS5B to produce an inactive
polymerase by targeting motifs essential for replication. Examples of motifs critical
for NS5B activity and modifications that can be made to produce an inactive NS5B
are described by Lohmann et al., Journal of Virology 71:8416-8426, 1997, and
Kolykhalov et al., Journal of Virology 74:2046-2051, 2000.

Additional factors to take into account when producing modifications
to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide include maintaining the
ability to self-process and maintaining T cell antigens. The ability of the HCV
polypeptide to process itself is determined to a large extent by a functional NS3
protease. Modifications that maintain NS3 protease activity can be obtained
by taking into account the NS3 protein, NS4A which serves as a cofactor for NS3, and
NS3 protease recognition sites present within the NS3-NS4A-NS4B-NS5A-NS5B
polypeptide.

Different modifications can be made to naturally occurring
NS3-NS4A-NS4B-NS5A-NS5B polypeptide sequences to produce polypeptides able to
elicit a broad range of T cell responses. Factors influencing the ability of a
polypeptide to elicit a broad T cell response include the preservation or introduction
of HCV specific T cell antigen regions and prevalence of different T cell antigen
regions in different HCV isolates.

Numerous examples of naturally occurring HCV isolates are well
known in the art. HCV isolates can be classified into the following six major
genotypes comprising one or more subtypes: HCV-1/(1a,1b,1c), HCV-2/(2a,2b,2c),
HCV-3/(3a,3b,10a), HCV-4/(4a), HCV-5/(5a) and HCV-6/(6a,6b,7b,8b,9a,11a).
(Simmonds, J. Gen. Virol., 693-712, 2001.) Examples of particular HCV sequences
such as HCV-BK, HCV-J, HCV-N, HCV-H, have been deposited in GenBank and
described in various publications. (See, for example, Chamberlain et al., J. Gen.
Virol., 1341-1347, 1997.)

HCV T cell antigens can be identified by, for example, empirical
experimentation. One way of identifying T cell antigens involves generating a series
of overlapping short peptides from a longer length polypeptide and then screening the
T-cell populations from infected patients for positive clones. Positive clones are
activated/primed by a particular peptide. Techniques such as IFNγ-ELISPOT,
IFNγ-intracellular staining and bulk CTL assays can be used to measure peptide activity.
Peptides thus identified can be considered to represent T-cell epitopes of the
respective pathogen.
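
The generation of overlapping short peptides from a longer polypeptide can be sketched in Python as a sliding window. This is a minimal illustration under stated assumptions: the window length and step are hypothetical parameters chosen for the example (the text does not specify peptide length or overlap), and the input string is an arbitrary toy sequence rather than any HCV sequence.

```python
def overlapping_peptides(polypeptide, length=15, step=4):
    """Yield (1-based start, peptide) windows covering the whole sequence.

    length and step are hypothetical defaults, not values from the text.
    Consecutive windows overlap by (length - step) residues.
    """
    if length < 1 or step < 1 or step > length:
        raise ValueError("need 1 <= step <= length so no residue is skipped")
    n = len(polypeptide)
    if n == 0:
        return
    if n <= length:
        yield 1, polypeptide
        return
    starts = list(range(0, n - length + 1, step))
    if starts[-1] != n - length:
        starts.append(n - length)  # final window flush with the C-terminus
    for s in starts:
        yield s + 1, polypeptide[s:s + length]


toy = "ACDEFGHIKLMNPQRSTVWYACDEFG"  # arbitrary toy string, not an HCV sequence
for start, peptide in overlapping_peptides(toy, length=10, step=5):
    print(start, peptide)
```

Each peptide would then be tested individually or in pools in the assays named above; the final window is shifted so that the C-terminal residues are always covered.
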

HCV T cell antigen regions from different HCV isolates can be 
introduced into a single sequence by, for example, producing a hybrid NS3-NS4A- 
NS4B-NS5A-NS5B polypeptide containing regions from two or more naturally 
occurring sequences. Such a hybrid can contain additional modifications, which 
preferably do not reduce the ability of the polypeptide to produce an HCV CMI
response. 

The ability of a modified Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide to process itself and produce a CMI response can be determined using 
techniques described herein or well known in the art. Such techniques include the use
of IFNγ-ELISPOT, IFNγ-intracellular staining and bulk CTL assays to measure a
HCV specific CMI response.

A. Met-NS3-NS4A-NS4B-NS5A-NS5B Sequences 

SEQ. ID. NO. 1 provides a preferred Met-NS3-NS4A-NS4B-NS5A-NS5B
sequence. SEQ. ID. NO. 1 contains a large number of HCV specific T cell
antigens that are present in several different HCV isolates. SEQ. ID. NO. 1 is similar
to the NS3-NS4A-NS4B-NS5A-NS5B portion of the HCV BK strain nucleotide
sequence (GenBank accession number M58335).

In SEQ. ID. NO. 1 anchor positions important for recognition by MHC
class I molecules are conserved or represent conservative substitutions for 18 out of
20 known T-cell epitopes in the NS3-NS4A-NS4B-NS5A-NS5B portion of HCV
polyproteins. With respect to the remaining two known T-cell epitopes, one has a
non-conservative anchor substitution in SEQ. ID. NO. 1 that may still be recognized
by a different HLA supertype and one epitope has one anchor residue not conserved.
HCV T-cell epitopes are described in Chisari et al., Curr. Top. Microbiol. Immunol.,
242:299-325, 2000, and Lechner et al., J. Exp. Med. 9:1499-1512, 2000.

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B
nucleotide sequence and SEQ. ID. NO. 1 include the introduction of a methionine at
the 5' end and the presence of modified NS5B active site residues in SEQ. ID. NO. 1.
The modification replaces GlyAspAsp with AlaAlaGly (residues 1711-1713) to
inactivate NS5B.

The encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide
preferably has an amino acid sequence substantially similar to SEQ. ID. NO. 1. In
different embodiments, the encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide has an amino acid identity to SEQ. ID. NO. 1 of at least 65%, at least
75%, at least 85%, at least 95%, at least 99% or 100%; or differs from SEQ. ID. NO.
1 by 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16,
1-17, 1-18, 1-19, or 1-20 amino acids.

Amino acid differences between a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide and SEQ. ID. NO. 1 are calculated by determining the minimum
number of amino acid modifications in which the two sequences differ. Amino acid
modifications can be deletions, additions, substitutions or any combination thereof.
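
One common way to compute a minimum number of deletions, additions and substitutions between two sequences is the Levenshtein edit distance. The Python sketch below uses that approach as an assumption; the text does not name a specific algorithm, and the function name and toy peptides are hypothetical.

```python
def amino_acid_differences(seq_a, seq_b):
    """Minimum number of single-residue deletions, additions, or
    substitutions needed to turn seq_a into seq_b (Levenshtein distance)."""
    previous = list(range(len(seq_b) + 1))
    for i, res_a in enumerate(seq_a, start=1):
        current = [i]
        for j, res_b in enumerate(seq_b, start=1):
            current.append(min(
                previous[j] + 1,                     # delete res_a
                current[j - 1] + 1,                  # add res_b
                previous[j - 1] + (res_a != res_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


# Toy peptides (not taken from any SEQ. ID. NO.): a GDD motif changed to AAG.
print(amino_acid_differences("KLMGDDPQR", "KLMAAGPQR"))  # 3
```
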

Amino acid sequence identity is determined by methods well known in
the art that compare the amino acid sequence of one polypeptide to the amino acid
sequence of a second polypeptide and generate a sequence alignment. Amino acid
identity is calculated from the alignment by counting the number of aligned residue
pairs that have identical amino acids.
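
The counting step can be sketched in Python for two already-aligned sequences. This is an illustration under assumptions: gaps are written as "-", and the percentage is taken over aligned residue pairs only (columns containing a gap are excluded). The text specifies the count of identical pairs but not the denominator, and alignment programs differ on that point. The function names are hypothetical; the 65% cut-off is the "substantially similar" threshold used in this document.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned residue pairs that carry identical amino acids.

    Columns containing a gap are not residue pairs and are excluded from
    both the count and the denominator (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    identical = sum(a == b for a, b in pairs)
    return 100.0 * identical / len(pairs)


def is_substantially_similar(aligned_a, aligned_b, threshold=65.0):
    """Apply the 'at least about 65%' identity cut-off to an alignment."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ACDEF", "ACGEF"))  # 80.0
```
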

Methods for determining sequence identity include those described by
Schuler, G.D. in Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins, Baxevanis, A.D. and Ouellette, B.F.F., eds., John Wiley & Sons, Inc., 2001;
Yona, et al., in Bioinformatics: Sequence, structure and databanks, Higgins, D. and
Taylor, W. eds., Oxford University Press, 2000; and Bioinformatics: Sequence and
Genome Analysis, Mount, D.W., ed., Cold Spring Harbor Laboratory Press, 2001.
Methods to determine amino acid sequence identity are codified in publicly available
computer programs such as GAP (Wisconsin Package Version 10.2, Genetics
Computer Group (GCG), Madison, Wisc.), BLAST (Altschul et al., J. Mol. Biol.
215(3):403-10, 1990), and FASTA (Pearson, Methods in Enzymology 183:63-98,
1990, R.F. Doolittle, ed.).

In an embodiment of the present invention sequence identity between
two polypeptides is determined using the GAP program (Wisconsin Package Version
10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the alignment
method of Needleman and Wunsch. (Needleman et al., J. Mol. Biol. 48:443-453,
1970.) GAP considers all possible alignments and gap positions between two
sequences and creates a global alignment that maximizes the number of matched
residues and minimizes the number and size of gaps. A scoring matrix is used to
assign values for symbol matches. In addition, a gap creation penalty and a gap
extension penalty are required to limit the insertion of gaps into the alignment.
Default program parameters for polypeptide comparisons using GAP are the
BLOSUM62 (Henikoff et al., Proc. Natl. Acad. Sci. USA 89:10915-10919, 1992)
amino acid scoring matrix (MATrix=blosum62.cmp), a gap creation parameter
(GAPweight=8) and a gap extension parameter (LENgthweight=2).
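
The following sketch illustrates global alignment scoring with separate gap creation and gap extension penalties. It is not the GAP program and is not expected to reproduce its output. Two simplifications are assumed: a flat match/mismatch score stands in for the BLOSUM62 matrix, and a gap of length n is charged `gap_open + n * gap_extend`. The function name and all parameter names are hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1,
                           gap_open=8, gap_extend=2):
    """Best global alignment score with affine gap penalties.

    Three tables track the best score of alignments ending in:
    a residue pair, a gap in seq_b, or a gap in seq_a.
    """
    n, m = len(seq_a), len(seq_b)
    paired = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    gap_in_b = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    gap_in_a = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    paired[0][0] = 0
    for i in range(1, n + 1):
        gap_in_b[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        gap_in_a[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            paired[i][j] = pair + max(paired[i - 1][j - 1],
                                      gap_in_b[i - 1][j - 1],
                                      gap_in_a[i - 1][j - 1])
            # Opening a new gap costs creation plus one extension unit;
            # lengthening an existing gap costs one extension unit.
            gap_in_b[i][j] = max(paired[i - 1][j] - gap_open - gap_extend,
                                 gap_in_a[i - 1][j] - gap_open - gap_extend,
                                 gap_in_b[i - 1][j] - gap_extend)
            gap_in_a[i][j] = max(paired[i][j - 1] - gap_open - gap_extend,
                                 gap_in_b[i][j - 1] - gap_open - gap_extend,
                                 gap_in_a[i][j - 1] - gap_extend)
    return max(paired[n][m], gap_in_b[n][m], gap_in_a[n][m])


# Hypothetical fragments: one residue is missing from the second sequence.
print(global_alignment_score("ACDE", "ACE", match=5, mismatch=-4))
```

Keeping the creation penalty well above the extension penalty favors a few longer gaps over many short ones, which is the behavior the gap parameters are meant to control.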

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptides, in addition to being substantially similar to SEQ. ID. NO. 1 across their
entire length, produce individual NS3, NS4A, NS4B, NS5A and NS5B regions that are
substantially similar to the corresponding regions present in SEQ. ID. NO. 1. The
corresponding regions in SEQ. ID. NO. 1 are provided as follows: Met-NS3 amino
acids 1-632; NS4A amino acids 633-686; NS4B amino acids 687-947; NS5A amino
acids 948-1394; and NS5B amino acids 1395-1985.
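
The region boundaries listed above can be held in a small lookup table so that an individual region can be cut out of a full-length sequence before a region-by-region comparison. This is a sketch only: the names `REGIONS`, `extract_region`, and `region_of` are hypothetical, and the placeholder string stands in for SEQ. ID. NO. 1, which is not reproduced here.

```python
# 1-based inclusive amino acid coordinates, as listed for SEQ. ID. NO. 1.
REGIONS = {
    "Met-NS3": (1, 632),
    "NS4A": (633, 686),
    "NS4B": (687, 947),
    "NS5A": (948, 1394),
    "NS5B": (1395, 1985),
}


def region_length(name: str) -> int:
    start, end = REGIONS[name]
    return end - start + 1


def extract_region(polypeptide: str, name: str) -> str:
    """Return the residues of one region from a full-length sequence."""
    start, end = REGIONS[name]
    return polypeptide[start - 1:end]


def region_of(residue: int) -> str:
    """Name of the region containing a 1-based residue number."""
    for name, (start, end) in REGIONS.items():
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} lies outside residues 1-1985")


placeholder = "X" * 1985  # stand-in for the real sequence
print(len(extract_region(placeholder, "NS5B")))
print(region_of(1711))  # first residue of the modified catalytic motif
```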

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B
region has an amino acid identity to the corresponding region in SEQ. ID. NO. 1 of at
least 65%, at least 75%, at least 85%, at least 95%, at least 99%, or 100%; or an
amino acid difference of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13,
1-14, 1-15, 1-16, 1-17, 1-18, 1-19, or 1-20 amino acids.

Amino acid modifications to SEQ. ID. NO. 1 preferably maintain all or
most of the T-cell antigen regions. Differences in naturally occurring amino acids are
due to different amino acid side chains (R groups). An R group affects different
properties of the amino acid such as physical size, charge, and hydrophobicity.
Amino acids can be divided into different groups as follows: neutral and hydrophobic
(alanine, valine, leucine, isoleucine, proline, tryptophan, phenylalanine, and
methionine); neutral and polar (glycine, serine, threonine, tyrosine, cysteine,
asparagine, and glutamine); basic (lysine, arginine, and histidine); and acidic (aspartic
acid and glutamic acid).

Generally, in substituting different amino acids it is preferable to
exchange amino acids having similar properties. Substitutions of different amino acids
within a particular group, such as valine for leucine, arginine for lysine,
and asparagine for glutamine, are good candidates for not causing a change in
polypeptide tertiary structure.
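
The four groups named above can be encoded directly to check whether a proposed substitution stays within one group. The sketch below uses one-letter amino acid codes; the names `GROUPS`, `group_of`, and `is_within_group` are hypothetical. Membership of the same group indicates only a candidate substitution and does not predict its structural effect.

```python
# Groups exactly as listed in the text, written in one-letter codes.
GROUPS = {
    "neutral and hydrophobic": set("AVLIPWFM"),
    "neutral and polar": set("GSTYCNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def group_of(residue: str) -> str:
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_within_group(original: str, replacement: str) -> bool:
    """True when both residues belong to the same group."""
    return group_of(original) == group_of(replacement)


print(is_within_group("L", "V"))  # valine for leucine
print(is_within_group("D", "K"))  # acidic replaced by basic
```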

Starting with a particular amino acid sequence and the known
degeneracy of the genetic code, a large number of different encoding nucleic acid
sequences can be obtained. The degeneracy of the genetic code arises because almost
all amino acids are encoded by different combinations of nucleotide triplets or
"codons". The translation of a particular codon into a particular amino acid is well
known in the art (see, e.g., Lewin GENES IV, p. 119, Oxford University Press, 1990).
Amino acids are encoded by codons as follows:

A=Ala=Alanine: codons GCA, GCC, GCG, GCU
C=Cys=Cysteine: codons UGC, UGU
D=Asp=Aspartic acid: codons GAC, GAU
E=Glu=Glutamic acid: codons GAA, GAG
F=Phe=Phenylalanine: codons UUC, UUU
G=Gly=Glycine: codons GGA, GGC, GGG, GGU
H=His=Histidine: codons CAC, CAU
I=Ile=Isoleucine: codons AUA, AUC, AUU
K=Lys=Lysine: codons AAA, AAG
L=Leu=Leucine: codons UUA, UUG, CUA, CUC, CUG, CUU
M=Met=Methionine: codon AUG
N=Asn=Asparagine: codons AAC, AAU
P=Pro=Proline: codons CCA, CCC, CCG, CCU
Q=Gln=Glutamine: codons CAA, CAG
R=Arg=Arginine: codons AGA, AGG, CGA, CGC, CGG, CGU
S=Ser=Serine: codons AGC, AGU, UCA, UCC, UCG, UCU
T=Thr=Threonine: codons ACA, ACC, ACG, ACU
V=Val=Valine: codons GUA, GUC, GUG, GUU
W=Trp=Tryptophan: codon UGG
Y=Tyr=Tyrosine: codons UAC, UAU
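
The size of the set of encoding sequences follows from the codon table: the number of distinct nucleic acid sequences encoding a given polypeptide is the product of the codon counts of its residues. A minimal sketch, in which the names `CODONS` and `count_encodings` are hypothetical:

```python
from math import prod

# Standard genetic code (RNA codons), keyed by one-letter amino acid code.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"],
    "E": ["GAA", "GAG"],
    "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "M": ["AUG"],
    "N": ["AAC", "AAU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
}


def count_encodings(polypeptide: str) -> int:
    """Number of distinct codon sequences encoding the polypeptide."""
    return prod(len(CODONS[residue]) for residue in polypeptide.upper())


print(count_encodings("MW"))   # both residues have a single codon
print(count_encodings("LRS"))  # three residues with six codons each
```

Because the count is multiplicative, it grows very quickly with polypeptide length, which is why a large number of encoding sequences is available for any full-length polypeptide.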

Nucleic acid sequences can be optimized in an effort to enhance
expression in a host. Factors to be considered include C:G content, preferred codons,
and the avoidance of inhibitory secondary structure. These factors can be combined
in different ways in an attempt to obtain nucleic acid sequences having enhanced
expression in a particular host. (See, for example, Donnelly et al., International
Publication Number WO 97/47358.)
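
One of the factors named above, the proportion of G and C nucleotides, is simple to measure when comparing candidate coding sequences. The sketch below is illustrative only: the function name `gc_fraction` is hypothetical, the two nine-nucleotide sequences are made-up synonymous encodings of Ala-Gly-Leu, and no claim is made about which value any particular host prefers.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of nucleotides in the sequence that are G or C."""
    if not sequence:
        raise ValueError("sequence is empty")
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


# Two hypothetical encodings of the same tripeptide (Ala-Gly-Leu).
candidate_1 = "GCAGGAUUA"
candidate_2 = "GCCGGCCUG"
print(round(gc_fraction(candidate_1), 2))
print(round(gc_fraction(candidate_2), 2))
```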

The ability of a particular sequence to have enhanced expression in a
particular host involves some empirical experimentation. Such experimentation
involves measuring expression of a prospective nucleic acid sequence and, if needed,
altering the sequence.

B. Encoding Nucleotide Sequences

SEQ. ID. NOs. 2 and 3 provide two examples of nucleotide sequences
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B sequence. The coding sequence of
SEQ. ID. NO. 2 is similar (99.4% nucleotide sequence identity) to the NS3-NS4A-
NS4B-NS5A-NS5B region of the naturally occurring HCV-BK sequence (GenBank
accession number M58335). SEQ. ID. NO. 3 is a codon-optimized version of SEQ.
ID. NO. 2. SEQ. ID. NOs. 2 and 3 have a nucleotide sequence identity of 78.3%.

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B
nucleotide sequence (GenBank accession number M58335) and SEQ. ID. NO. 2 include SEQ.
ID. NO. 2 having a ribosome binding site, an ATG methionine codon, a region coding
for a modified NS5B catalytic domain, a TAAA stop signal and an additional 30
nucleotide differences. The modified catalytic domain codes for AlaAlaGly
(residues 1711-1713) instead of GlyAspAsp to inactivate NS5B.
A nucleotide sequence encoding a HCV Met-NS3-NS4A-NS4B-
NS5A-NS5B polypeptide is preferably substantially similar to the SEQ. ID. NO. 2
coding region. In different embodiments, the nucleotide sequence encoding a HCV
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide has a nucleotide sequence identity
to the SEQ. ID. NO. 2 coding region of at least 65%, at least 75%, at least 85%, at
least 95%, at least 99%, or 100%; or differs from SEQ. ID. NO. 2 by 1-2, 1-3, 1-4,
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20,
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides.

Nucleotide differences between a sequence coding Met-NS3-NS4A-
NS4B-NS5A-NS5B and the SEQ. ID. NO. 2 coding region are calculated by
determining the minimum number of nucleotide modifications in which the two
sequences differ. Nucleotide modifications can be deletions, additions, substitutions
or any combination thereof.

Nucleotide sequence identity is determined by methods well known in
the art that compare the nucleotide sequence of one sequence to the nucleotide
sequence of a second sequence and generate a sequence alignment. Sequence identity
is determined from the alignment by counting the number of aligned positions having
identical nucleotides.

Methods for determining nucleotide sequence identity between two
polynucleotides include those described by Schuler, in Bioinformatics: A Practical
Guide to the Analysis of Genes and Proteins, Baxevanis, A.D. and Ouelette, B.F.F.,
eds., John Wiley & Sons, Inc, 2001; Yona et al., in Bioinformatics: Sequence,
structure and databanks, Higgins, D. and Taylor, W. eds, Oxford University Press,
2000; and Bioinformatics: Sequence and Genome Analysis, Mount, D.W., ed., Cold
Spring Harbor Laboratory Press, 2001. Methods to determine nucleotide sequence
identity are codified in publicly available computer programs such as GAP
(Wisconsin Package Version 10.2, Genetics Computer Group (GCG), Madison,
Wisc.), BLAST (Altschul et al., J. Mol. Biol. 215(3):403-10, 1990), and FASTA
(Pearson, W.R., Methods in Enzymology 183:63-98, 1990, R.F. Doolittle, ed.).

In an embodiment of the present invention, sequence identity between
two polynucleotides is determined by application of GAP (Wisconsin Package
Version 10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the
alignment method of Needleman and Wunsch. (Needleman et al., J. Mol. Biol.
48:443-453, 1970.) GAP considers all possible alignments and gap positions between
two sequences and creates a global alignment that maximizes the number of matched
residues and minimizes the number and size of gaps. A scoring matrix is used to
assign values for symbol matches. In addition, a gap creation penalty and a gap
extension penalty are required to limit the insertion of gaps into the alignment.
Default program parameters for polynucleotide comparisons using GAP are the
nwsgapdna.cmp scoring matrix (MATrix=nwsgapdna.cmp), a gap creation parameter
(GAPweight=50) and a gap extension parameter (LENgthweight=3).

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B nucleotide
sequences, in addition to being substantially similar across their entire length, produce
individual NS3, NS4A, NS4B, NS5A and NS5B regions that are substantially similar
to the corresponding regions present in SEQ. ID. NO. 2. The corresponding coding
regions in SEQ. ID. NO. 2 are provided as follows: Met-NS3 nucleotides 7-1902;
NS4A nucleotides 1903-2064; NS4B nucleotides 2065-2847; NS5A nucleotides
2848-4188; and NS5B nucleotides 4189-5961.

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B
encoding region has a nucleotide sequence identity to the corresponding region in
SEQ. ID. NO. 2 of at least 65%, at least 75%, at least 85%, at least 95%, at least 99%
or 100%; or a nucleotide difference to SEQ. ID. NO. 2 of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7,
1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20, 1-25, 1-30,
1-35, 1-40, 1-45, or 1-50 nucleotides.

C. Gene Expression Cassettes

A gene expression cassette contains elements needed for polypeptide
expression. Reference to "polypeptide" does not provide a size limitation and
includes protein. Regulatory elements present in a gene expression cassette generally
include: (a) a promoter transcriptionally coupled to a nucleotide sequence encoding
the polypeptide, (b) a 5' ribosome binding site functionally coupled to the nucleotide
sequence, (c) a terminator joined to the 3' end of the nucleotide sequence, and (d) a 3'
polyadenylation signal functionally coupled to the nucleotide sequence. Additional
regulatory elements useful for enhancing or regulating gene expression or polypeptide
processing may also be present.

Promoters are genetic elements that are recognized by an RNA
polymerase and mediate transcription of downstream regions. Preferred promoters
are strong promoters that provide for increased levels of transcription. Examples of
strong promoters are the immediate early human cytomegalovirus promoter (CMV),
and CMV with intron A. (Chapman et al., Nucl. Acids Res. 19:3979-3986, 1991.)
Additional examples of promoters include naturally occurring promoters such as the
EF1 alpha promoter, the murine CMV promoter, Rous sarcoma virus promoter,
SV40 early/late promoters and the β-actin promoter; and artificial promoters such as a
synthetic muscle specific promoter and a chimeric muscle-specific/CMV promoter (Li
et al., Nat. Biotechnol. 17:241-245, 1999, Hagstrom et al., Blood 95:2536-2542,
2000).

The ribosome binding site is located at or near the initiation codon.
Examples of preferred ribosome binding sites include CCACCAUGG,
CCGCCAUGG, and ACCAUGG, where AUG is the initiation codon. (Kozak, Cell
44:283-292, 1986). Another example of a ribosome binding site is GCCACCAUGG
(SEQ. ID. NO. 12).
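
The ribosome binding sites listed above can be located in a transcript by plain string matching. The sketch below reports the 0-based position of each AUG that sits inside one of the four listed sites. The function name `initiation_codon_positions` and the test transcript are hypothetical, and DNA input is accepted by converting T to U.

```python
# The four ribosome binding sites given in the text.
PREFERRED_SITES = ("GCCACCAUGG", "CCACCAUGG", "CCGCCAUGG", "ACCAUGG")


def initiation_codon_positions(sequence: str) -> list:
    """Sorted 0-based positions of AUG codons found within a listed site."""
    rna = sequence.upper().replace("T", "U")
    positions = set()
    for site in PREFERRED_SITES:
        offset = site.index("AUG")  # where AUG sits inside this site
        start = rna.find(site)
        while start != -1:
            positions.add(start + offset)
            start = rna.find(site, start + 1)
    return sorted(positions)


# Hypothetical transcript fragment carrying GCCACCAUGG.
print(initiation_codon_positions("GGGCCACCAUGGCUCC"))
```

Positions are collected in a set because the longer sites contain the shorter ones, so a single AUG can match several patterns.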

The polyadenylation signal is responsible for cleaving the transcribed
RNA and the addition of a poly (A) tail to the RNA. The polyadenylation signal in
higher eukaryotes contains an AAUAAA sequence about 11-30 nucleotides from the
polyadenylation addition site. The AAUAAA sequence is involved in signaling RNA
cleavage. (Lewin, Genes IV, Oxford University Press, NY, 1990.) The poly (A) tail
is important for the mRNA processing.

Polyadenylation signals that can be used as part of a gene expression
cassette include the minimal rabbit β-globin polyadenylation signal and the bovine
growth hormone polyadenylation (BGH). (Xu et al., Gene 272:149-156, 2001, Post et
al., U.S. Patent No. 5,122,458.) Additional examples include the Synthetic
Polyadenylation Signal (SPA) and SV40 polyadenylation signal. The SPA sequence
is as follows: AAUAAAAGAUCUUUAUUUUCAUUAGAUCUGUGUG
UUGGUUUUUUGUGUG (SEQ. ID. NO. 13).
Examples of additional regulatory elements useful for enhancing or
regulating gene expression or polypeptide processing that may be present include an
enhancer, a leader sequence and an operator. An enhancer region increases
transcription. Examples of enhancer regions include the CMV enhancer and the
SV40 enhancer. (Hitt et al., Methods in Molecular Genetics 7:13-30, 1995, Xu et al.,
Gene 272:149-156, 2001.) An enhancer region can be associated with a promoter.

A leader sequence is an amino acid region on a polypeptide that directs
the polypeptide into the proteasome. Nucleic acid encoding the leader sequence is 5'
of a structural gene and is transcribed along with the structural gene. An example of a
leader sequence is tPA.

An operator sequence can be used to regulate gene expression. For
example, the Tet operator sequence can be used to repress gene expression.

II. THERAPEUTIC VECTORS
Nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide can be introduced into a patient using vectors suitable for therapeutic
administration. Suitable vectors can deliver nucleic acid into a target cell without
causing an unacceptable side effect.

Cellular expression is achieved using a gene expression cassette
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide. The gene expression
cassette contains regulatory elements for producing and processing a sufficient
amount of nucleic acid inside a target cell to achieve a beneficial effect.

Examples of vectors that can be used for therapeutic applications
include first and second generation adenovectors, helper dependent adenovectors,
adeno-associated viral vectors, retroviral vectors, alpha virus vectors, Venezuelan
Equine Encephalitis virus vector, and plasmid vectors. (Hitt et al., Advances in
Pharmacology 40:137-206, 1997, Johnston et al., U.S. Patent No. 6,156,588, and
Johnston et al., International Publication Number WO 95/32733.) Preferred vectors
for introducing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide into a subject are
first generation adenoviral vectors and plasmid DNA vectors.

A. First Generation Adenovectors

First generation adenovectors for expressing a gene expression cassette
contain the expression cassette in an E1 and optionally E3 deleted recombinant
adenovirus genome. The deletion in the E1 region is sufficiently large to remove
elements needed for adenoviral replication.

First generation adenovectors for expressing a Met-NS3-NS4A-NS4B-
NS5A-NS5B polypeptide contain an E1 and E3 deleted recombinant adenovirus
genome. The deletion in the E1 region is sufficiently large to remove elements
needed for adenoviral replication. The combinations of deletions of the E1 and E3
regions are sufficiently large to accommodate a gene expression cassette encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide.

The adenovirus has a double-stranded linear genome with inverted
terminal repeats at both ends. During viral replication, the genome is packaged inside
a viral capsid to form a virion. The virus enters its target cell through viral attachment
followed by internalization. (Hitt et al., Advances in Pharmacology 40:137-206,
1997.)

Adenovectors can be based on different adenovirus serotypes such as
those found in humans or animals. Examples of animal adenoviruses include bovine,
porcine, chimp, murine, canine, and avian (CELO). Preferred adenovectors are based
on human serotypes, more preferably Group B, C, or D serotypes. Examples of
human adenovirus Group B, C, D, or E serotypes include types 2 ("Ad2"), 4 ("Ad4"),
5 ("Ad5"), 6 ("Ad6"), 24 ("Ad24"), 26 ("Ad26"), 34 ("Ad34") and 35 ("Ad35").
Adenovectors can contain regions from a single adenovirus or from two or more
adenoviruses.

In different embodiments adenovectors are based on Ad5, Ad6, or a
combination thereof. Ad5 is described by Chroboczek et al., J. Virology 186:280-
285, 1992. Ad6 is described in Figures 7A-7N. An Ad6 based vector containing
Ad5 regions is described in the Example section provided below.

Adenovectors do not need to have their E1 and E3 regions completely
removed. Rather, a sufficient amount of the E1 region is removed to render the vector
replication incompetent in the absence of the E1 proteins being supplied in trans; and
the E1 deletion or the combination of the E1 and E3 deletions is sufficiently
large to accommodate a gene expression cassette.

E1 deletions can be obtained starting at about base pair 342 going up to
about base pair 3523 of Ad5, or a corresponding region from other adenoviruses.
Preferably, the deleted region involves removing a region from about base pair 450 to
about base pair 3511 of Ad5, or a corresponding region from other adenoviruses.
Larger E1 region deletions starting at about base pair 341 remove elements that
facilitate virus packaging.
E3 deletions can be obtained starting at about base pair 27865 to about
base pair 30995 of Ad5, or the corresponding region of other adenovectors.
Preferably the deletion region involves removing a region from about base pair 28134
up to about base pair 30817 of Ad5, or the corresponding region of other
adenovectors.

The combination of deletions to the E1 region and optionally the E3
region should be sufficiently large so that the overall size of the recombinant genome
containing the gene expression cassette does not exceed about 105% of the wild type
adenovirus genome. For example, as recombinant adenovirus Ad5 genomes increase
in size above about 105% of the wild type size, the genome becomes unstable. (Bett et al.,
Journal of Virology 67:5911-5921, 1993.)

Preferably, the size of the recombinant adenovirus genome containing
the gene expression cassette is about 85% to about 105% the size of the wild type
adenovirus genome. In different embodiments, the size of the recombinant
adenovirus genome containing the expression cassette is about 100% to about
105.2%, or about 100%, the size of the wild type genome.

Approximately 7,500 base pairs can be inserted into an adenovirus genome
with an E1 and E3 deletion. Without any deletion, the Ad5 genome is 35,935 base
pairs and the Ad6 genome is 35,759 base pairs.
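
The insert capacity can be worked out from the Ad5 genome size, the preferred deletions, and the 105% limit. The sketch below is a back-of-the-envelope calculation under stated assumptions: the E1 deletion is taken as base pairs 451-3510 and the E3 deletion as base pairs 28134-30817, both inclusive, and the "about 105%" limit is treated as exact. All names are hypothetical.

```python
AD5_GENOME_BP = 35_935
E1_DELETION = (451, 3_510)      # assumed inclusive bounds
E3_DELETION = (28_134, 30_817)  # assumed inclusive bounds
MAX_PERCENT = 105               # "about 105%" treated as an exact limit


def deleted_bp(deletion) -> int:
    first, last = deletion
    return last - first + 1


def max_insert_bp(genome_bp: int, deletions, max_percent: int = MAX_PERCENT) -> int:
    """Largest insert that keeps the genome within the size limit."""
    backbone = genome_bp - sum(deleted_bp(d) for d in deletions)
    size_limit = genome_bp * max_percent // 100  # integer arithmetic
    return size_limit - backbone


print(max_insert_bp(AD5_GENOME_BP, [E1_DELETION, E3_DELETION]))
print(max_insert_bp(AD5_GENOME_BP, [E1_DELETION]))
```

Under these assumptions the double deletion leaves room for an insert of 7,540 base pairs, and the E1 deletion alone for 4,856 base pairs.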

Replication of first generation adenovectors can be performed by
supplying the E1 gene products in trans. The E1 gene product can be supplied in
trans, for example, by using cell lines that have been transformed with the adenovirus
E1 region. Examples of cells and cell lines transformed with the adenovirus E1
region are HEK 293 cells, 911 cells, PER.C6™ cells, and transfected primary human
amniocytes. (Graham et al., Journal of Virology 35:59-72, 1977, Schiedner et
al., Human Gene Therapy 11:2105-2116, 2000, Fallaux et al., Human Gene Therapy
9:1909-1917, 1998, Bout et al., U.S. Patent No. 6,033,908.)

A Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette should be
inserted into a recombinant adenovirus genome in the region corresponding to the
deleted E1 region or the deleted E3 region. The expression cassette can have a
parallel or anti-parallel orientation. In a parallel orientation the transcription direction
of the inserted gene is the same direction as the deleted E1 or E3 gene. In an anti-
parallel orientation the opposite strand serves as a template and the
transcription direction is in the opposite direction.
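
In sequence terms, placing a cassette in the anti-parallel orientation amounts to inserting its reverse complement, so that the opposite strand serves as the template. A minimal sketch with hypothetical function names and made-up flanking sequences:

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def insert_cassette(left_flank: str, cassette: str, right_flank: str,
                    anti_parallel: bool = False) -> str:
    """Join a cassette between two flanks in either orientation."""
    oriented = reverse_complement(cassette) if anti_parallel else cassette
    return left_flank + oriented + right_flank


# Hypothetical flanks and cassette, for illustration only.
print(insert_cassette("AAA", "ATGGCC", "TTT"))
print(insert_cassette("AAA", "ATGGCC", "TTT", anti_parallel=True))
```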

In an embodiment of the present invention the adenovector has a gene
expression cassette inserted in the E1 deleted region. The vector contains:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In another embodiment of the present invention the adenovector has an
expression cassette inserted in the E3 deleted region. The vector contains:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) a gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In preferred different embodiments concerning adenovirus regions that
are present: (1) the first, second, third, fourth, and fifth region corresponds to Ad5; (2)
the first, second, third, fourth, and fifth region corresponds to Ad6; and (3) the first
region corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.
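
The adenovirus backbone length implied by each of the three preferred combinations can be tallied from the region coordinates given above. In the sketch below the "about" coordinates are treated as exact inclusive bounds, which is an assumption, and the names `REGIONS` and `backbone_length` are hypothetical. The expression cassette is not included in the totals.

```python
# Inclusive base pair bounds of the first to fifth adenovirus regions.
REGIONS = {
    "Ad5": [(1, 450), (3511, 5548), (5549, 28133),
            (30818, 33966), (33967, 35935)],
    "Ad6": [(1, 450), (3508, 5541), (5542, 28156),
            (30789, 33784), (33785, 35759)],
}


def backbone_length(sources) -> int:
    """Total base pairs when region i is taken from serotype sources[i]."""
    if len(sources) != 5:
        raise ValueError("exactly five region sources are required")
    total = 0
    for index, serotype in enumerate(sources):
        first, last = REGIONS[serotype][index]
        total += last - first + 1
    return total


print(backbone_length(["Ad5"] * 5))                          # combination (1)
print(backbone_length(["Ad6"] * 5))                          # combination (2)
print(backbone_length(["Ad5", "Ad5", "Ad6", "Ad6", "Ad5"]))  # combination (3)
```

Comparing any of these totals with the wild type genome size indicates how much room remains for an expression cassette.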

B. DNA Plasmid Vectors 
DNA vaccine plasmid vectors contain a gene expression cassette along
with elements facilitating replication and preferably vector selection. Preferred
elements provide for replication in non-mammalian cells and a selectable marker.
The vectors should not contain elements providing for replication in human cells or
for integration into human nucleic acid.
The selectable marker facilitates selection of nucleic acids containing
the marker. Preferred selectable markers are those that confer antibiotic resistance.
Examples of antibiotic selection genes include nucleic acid encoding resistance to
ampicillin, neomycin, and kanamycin.

Suitable DNA vaccine vectors can be produced starting with a plasmid
containing a bacterial origin of replication and a selectable marker. Examples of
bacterial origins of replication providing for higher yields include the ColE1 plasmid-
derived bacterial origin of replication. (Donnelly et al., Annu. Rev. Immunol. 15:617-
648, 1997.)

The presence of the bacterial origin of replication and selectable
marker allows for the production of the DNA vector in a bacterial strain such as E.
coli. The selectable marker is used to eliminate bacteria not containing the DNA
vector.

III. AD6 RECOMBINANT NUCLEIC ACID
Ad6 recombinant nucleic acid comprises an Ad6 region substantially
similar to an Ad6 region found in SEQ. ID. NO. 8, and a region not present in Ad6
nucleic acid. Recombinant nucleic acid comprising Ad6 regions has different uses
such as in producing different Ad6 regions, as intermediates in the production of Ad6
based vectors, and as a vector for delivering a recombinant gene.

As depicted in Figure 9, the genomic organization of Ad6 is very 
similar to the genomic organization of Ad5. The homology between Ad5 and Ad6 is 
approximately 98%. 

In different embodiments, the Ad6 recombinant nucleic acid comprises
a nucleotide region substantially similar to E1A, E1B, E2B, E2A, E3, E4, L1, L2, L3,
or L4, or any combination thereof. A substantially similar nucleic acid region to an
Ad6 region has a nucleotide sequence identity of at least 65%, at least 75%, at least
85%, at least 95%, at least 99% or 100%; or a nucleotide difference of 1-2, 1-3, 1-4,
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20,
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides. Techniques and embodiments for
determining substantially similar nucleic acid sequences are described in Section I.B.
supra.

Preferably, the recombinant Ad6 nucleic acid contains an expression
cassette coding for a polypeptide not found in Ad6. Examples of expression cassettes
include those coding for HCV regions and those coding for other types of
polypeptides.

Different types of adenoviral vectors can be produced incorporating
different amounts of Ad6, such as first and second generation adenovectors. As noted
in Section II.A. supra, first generation adenovectors are defective in E1 and can
replicate when E1 is supplied in trans.

Second generation adenovectors contain less adenoviral genome than
first generation vectors and can be used in conjunction with complementing cell lines
and/or helper vectors supplying adenoviral proteins. Second generation adenovectors
are described in different references such as Russell, Journal of General Virology
81:2573-2604, 2000; Hitt et al., 1997, Human Ad vectors for Gene Transfer,
Advances in Pharmacology, Vol. 40, Academic Press.

In an embodiment of the present invention, the Ad6 recombinant
nucleic acid is an adenovirus vector defective in E1 that is able to replicate when E1 is
supplied in trans. Expression cassettes can be inserted into a deleted E1 region and/or
a deleted E3 region.

An example of an Ad6 based adenoviral vector with an expression
cassette provided in a deleted E1 region comprises or consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

e) an optionally present fourth region from about base pair 28134
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to
about base pair 30788 corresponding to Ad6, joined to the third region;

f) a fifth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, wherein the fifth region is joined to the fourth
region if the fourth region is present, or the fifth is joined to the third region if the
fourth region is not present; and

g) a sixth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fifth region;

wherein at least one Ad6 region is present.

In different embodiments of the invention, all of the regions are from
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4
regions selected from the second, third, fourth, and fifth regions are from Ad6.
An example of an Ad6 based adenoviral vector with an expression
cassette provided in a deleted E3 region comprises or consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) a gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region;

wherein at least one Ad6 region is present.

In different embodiments of the invention, all of the regions are from
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4
regions selected from the second, third, fourth and fifth regions are from Ad6.

IV. VECTOR PRODUCTION

Vectors can be produced using recombinant nucleic acid techniques
such as those involving the use of restriction enzymes, nucleic acid ligation, and
homologous recombination. Recombinant nucleic acid techniques are well known in
the art. (Ausubel, Current Protocols in Molecular Biology, John Wiley, 1987-1998,
and Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Laboratory Press, 1989.)

Intermediate vectors are used to derive a therapeutic vector or to 
transfer an expression cassette or portion thereof from one vector to another vector. 
Examples of intermediate vectors include adenovirus genome plasmids and shuttle
vectors.

Useful elements in an intermediate vector include an origin of 
replication, a selectable marker, homologous recombination regions, and convenient 
restriction sites. Convenient restriction sites can be used to facilitate cloning or release 
of a nucleic acid sequence.

Homologous recombination regions provide nucleic acid sequence
regions that are homologous to a target region in another nucleic acid molecule. The
homologous regions flank the nucleic acid sequence that is being inserted into the
target region. In different embodiments homologous regions are preferably about 150
to 600 nucleotides in length, or about 100 to 500 nucleotides in length.

An embodiment of the present invention describes a shuttle vector
containing a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette, a selectable
marker, a bacterial origin of replication, a first adenovirus homology region and a
second adenovirus homology region that target the expression cassette to insert in
or replace an E1 region. The first and second homology regions flank the expression
cassette. The first homology region contains at least about 100 base pairs
substantially homologous to at least the right end (3' end) of a wild-type adenovirus
region from about base pairs 4-450. The second homology region contains at least about 100
base pairs substantially homologous to at least the left end (5' end) of Ad5 from about
base pairs 3511-5792, or the corresponding region from another adenovirus.

Reference to "substantially homologous" indicates a sufficient degree
of homology to specifically recombine with a target region. In different embodiments
substantially homologous refers to at least 85%, at least 95%, or 100% sequence
identity. Sequence identity can be calculated as described in Section I.B., supra.
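The identity thresholds above can be illustrated with a short Python sketch. It assumes two sequences that have already been aligned to equal length and counts identical positions; the function names and this position-by-position comparison are assumptions made for illustration, and the identity calculation in the section cited above may differ.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def thresholds_met(identity: float) -> list:
    """Return which of the 85%, 95% and 100% identity thresholds are satisfied."""
    return [t for t in (85, 95, 100) if identity >= t]


if __name__ == "__main__":
    arm = "ACGTACGTAC"
    target = "ACGTACGTAT"  # one mismatch in ten positions
    ident = percent_identity(arm, target)
    print(ident, thresholds_met(ident))
```

A real homology arm of at least about 100 base pairs would be compared in the same way after alignment.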

One method of producing adenovectors is through the creation of an
adenovirus genome plasmid containing an expression cassette. The pre-Adenovirus
plasmid contains all the adenovirus sequences needed for replication in the desired
complementing cell line. The pre-Adenovirus plasmid is then digested with a
restriction enzyme to release the viral ITRs and transfected into the complementing
cell line for virus rescue. The ITRs must be released from plasmid sequences to
allow replication to occur. Adenovector rescue results in the production of an
adenovector containing the expression cassette.

A. Adenovirus Genome Plasmids 

Adenovirus genome plasmids contain an adenovector sequence inside
a longer-length plasmid (which may be a cosmid). The longer-length plasmid may
contain additional elements such as those facilitating growth and selection in
eukaryotic or bacterial cells depending upon the procedures employed to produce and
maintain the plasmid. Techniques for producing adenovirus genome plasmids include
those involving the use of shuttle vectors and homologous recombination, and those
involving the insertion of a gene expression cassette into an adenovirus cosmid. (Hitt
et al., Methods in Molecular Genetics 7:13-30, 1995; Danthinne et al., Gene Therapy
7:1707-1714, 2000.)

Adenovirus genome plasmids preferably have a gene expression
cassette inserted into an E1 or E3 deleted region. In an embodiment of the present
invention, the adenovirus genome plasmid contains a gene expression cassette
inserted in the E1 deleted region, an origin of replication, a selectable marker, and the
recombinant adenovirus region is made up of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the third region;

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region; and

g) an optionally present E3 region corresponding to all or part of
the E3 region present in Ad5 or Ad6, which may be present for smaller inserts taking
into account the overall size of the desired adenovector.

In another embodiment of the present invention the recombinant
adenovirus genome plasmid has the gene expression cassette inserted in the E3
deleted region. The vector contains an origin of replication, a selectable marker, and
the following:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) the gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth regions correspond to Ad5; (2) the
first, second, third, fourth, and fifth regions correspond to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.
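The region coordinates recited above lend themselves to a small lookup table. The Python sketch below stores the "about" base-pair coordinates for each region and serotype and sums the adenovirus backbone length for a chosen combination. The names `REGIONS` and `backbone_length` are hypothetical, the coordinates are treated as exact and inclusive only for the sake of the arithmetic, and the expression cassette and plasmid elements are not counted.

```python
# "About" base-pair coordinates as recited in the text (1-based, inclusive).
REGIONS = {
    "first":  {"Ad5": (1, 450),       "Ad6": (1, 450)},
    "second": {"Ad5": (3511, 5548),   "Ad6": (3508, 5541)},
    "third":  {"Ad5": (5549, 28133),  "Ad6": (5542, 28156)},
    "fourth": {"Ad5": (30818, 33966), "Ad6": (30789, 33784)},
    "fifth":  {"Ad5": (33967, 35935), "Ad6": (33785, 35759)},
}
ORDER = ["first", "second", "third", "fourth", "fifth"]


def backbone_length(choice):
    """Sum the lengths of the five regions for a per-region serotype choice."""
    total = 0
    for name in ORDER:
        start, end = REGIONS[name][choice[name]]
        total += end - start + 1
    return total


ALL_AD5 = dict.fromkeys(ORDER, "Ad5")
ALL_AD6 = dict.fromkeys(ORDER, "Ad6")
# Embodiment (3): Ad5, Ad5, Ad6, Ad6, Ad5.
CHIMERA = {"first": "Ad5", "second": "Ad5", "third": "Ad6",
           "fourth": "Ad6", "fifth": "Ad5"}

if __name__ == "__main__":
    for label, choice in (("Ad5", ALL_AD5), ("Ad6", ALL_AD6), ("chimera", CHIMERA)):
        print(label, backbone_length(choice))
```

Such a tally can help when judging how much room remains for an insert, given the overall size of the desired adenovector.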

An embodiment of the present invention describes a method of making
an adenovector involving a homologous recombination step to produce an adenovirus
genome plasmid and an adenovirus rescue step. The homologous recombination step
involves the use of a shuttle vector containing a Met-NS3-NS4A-NS4B-NS5A-NS5B
expression cassette flanked by adenovirus homology regions. The adenovirus
homology regions target the expression cassette into either the E1 or E3 deleted
region.

In an embodiment of the present invention concerning the production
of an adenovirus genome plasmid, the gene expression cassette is inserted into a
vector comprising: a first adenovirus region from about base pair 1 to about base pair
450 corresponding to either Ad5 or Ad6; a second adenovirus region from about base
pair 3511 to about base pair 5548 corresponding to Ad5 or from about base pair 3508
to about base pair 5541 corresponding to Ad6, joined to the first region; a third
adenovirus region from about base pair 5549 to about base pair 28133 corresponding
to Ad5 or from about base pair 5542 to about base pair 28156 corresponding to Ad6,
joined to the second region; a fourth adenovirus region from about base pair 30818 to
about base pair 33966 corresponding to Ad5 or from about base pair 30789 to about
base pair 33784 corresponding to Ad6, joined to the third region; and a fifth
adenovirus region from about base pair 33967 to about base pair 35935 corresponding to Ad5 or from
about base pair 33785 to about base pair 35759 corresponding to Ad6, joined to the
fourth region. The adenovirus genome plasmid should contain an origin of replication
and a selectable marker, and may contain all or part of the Ad5 or Ad6 E3 region.

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth regions correspond to Ad5; (2) the
first, second, third, fourth, and fifth regions correspond to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.

B. Adenovector Rescue

An adenovector can be rescued from a recombinant adenovirus
genome plasmid using techniques known in the art or described herein. Examples of
techniques for adenovirus rescue well known in the art are provided by Hitt et al.,
Methods in Molecular Genetics 7:13-30, 1995, and Danthinne et al., Gene Therapy
7:1707-1714, 2000.

A preferred method of rescuing an adenovector described herein
involves boosting adenoviral replication. Boosting adenoviral replication can be
performed, for example, by supplying adenoviral functions such as E2 proteins
(polymerase, pre-terminal protein and DNA binding protein) as well as E4 orf6 on a
separate plasmid. Example 10, infra, illustrates the boosting of adenoviral replication
to rescue an adenovector containing a codon optimized Met-NS3-NS4A-NS4B-
NS5A-NS5B expression cassette.

V. PARTIAL-OPTIMIZED HCV ENCODING SEQUENCES

Partial optimization of HCV polyprotein encoding nucleic acid
provides for fewer codons optimized for expression in a human than
complete optimization. The overall objective is to provide the benefits of increased
expression due to codon optimization, while facilitating the production of an
adenovector containing HCV polyprotein encoding nucleic acid having optimized
codons.

Complete optimization of an HCV polyprotein encoding sequence
provides the most frequently observed human codon for each amino acid. Complete
optimization can be performed using codon frequency tables well known in the art
and using programs such as the BACKTRANSLATE program (Wisconsin Package
version 10, Genetics Computer Group, GCG, Madison, Wisc.).

Partial optimization can be performed on an entire HCV polyprotein
encoding sequence that is present (e.g., NS3-NS5B), or on one or more local regions that are
present. In different embodiments the GC content for the entire HCV encoded polyprotein
that is present is no greater than about 65%; and the GC content for one or more
local regions is no greater than about 70%.

Local regions are regions present in HCV encoding nucleic acid, and can
vary in size. For example, local regions can be about 60, about 70, about 80, about 90 or
about 100 nucleotides in length.

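The GC limits above can be checked mechanically. The following Python sketch computes the overall GC fraction of a coding sequence and the highest GC fraction in any local window, then compares both against the limits of about 65% and about 70%. Treating local regions as sliding windows of fixed size is an assumption of this sketch, as are the function names; the text only states that local regions can be about 60 to about 100 nucleotides in length.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_local_gc(seq, window=100):
    """Highest GC fraction over all sliding windows of the given size."""
    if len(seq) < window:
        return gc_fraction(seq)
    return max(gc_fraction(seq[i:i + window])
               for i in range(len(seq) - window + 1))


def within_limits(seq, overall_max=0.65, local_max=0.70, window=100):
    """True if both the overall and the local GC limits are respected."""
    return (gc_fraction(seq) <= overall_max
            and max_local_gc(seq, window) <= local_max)


if __name__ == "__main__":
    balanced = "ACGT" * 50                 # 50% GC everywhere
    gc_island = "AT" * 100 + "GC" * 50     # low overall GC, one GC-only stretch
    print(within_limits(balanced), within_limits(gc_island))
```

The second sequence shows why both limits matter: its overall GC content is low, yet one local region consists entirely of G and C.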
Partial optimization can be achieved by initially constructing an HCV
encoding polyprotein sequence to be partially optimized based on a naturally occurring
sequence. Alternatively, an optimized HCV encoding sequence can be used as a basis for
comparison to produce a partially optimized sequence.

VI. HCV COMBINATION TREATMENT 

The HCV Met-NS3-NS4A-NS4B-NS5A-NS5B vaccine can be used by
itself to treat a patient, can be used in conjunction with other HCV therapeutics, and
can be used with agents targeting other types of diseases. Additional therapeutics
include additional therapeutic agents to treat HCV and diseases having a high
prevalence in HCV infected persons. Agents targeting other types of disease include
vaccines directed against HIV and HBV.

Additional therapeutics for treating HCV include vaccines and non-
vaccine agents. (Zein, Expert Opin. Investig. Drugs 10:1457-1469, 2001.) Examples
of additional HCV vaccines include vaccines designed to elicit an immune response
against an HCV core antigen and the HCV E1, E2 or p7 region. Vaccine components
can be naturally occurring HCV polypeptides, HCV mimotope polypeptides or nucleic
acid encoding such polypeptides.

HCV mimotope polypeptides contain HCV epitopes, but have a
different sequence than a naturally occurring HCV antigen. A HCV mimotope can be
fused to a naturally occurring HCV antigen. References describing techniques for
producing mimotopes in general and describing different HCV mimotopes are
provided in Felici et al., U.S. Patent No. 5,994,083 and Nicosia et al., International
Application Number WO 99/60132.

VII. PHARMACEUTICAL ADMINISTRATION

HCV vaccines can be formulated and administered to a patient using
the guidance provided herein along with techniques well known in the art. Guidelines
for pharmaceutical administration in general are provided in, for example, Modern
Vaccinology, Ed. Kurstak, Plenum Med. Co. 1994; Remington's Pharmaceutical
Sciences 18th Edition, Ed. Gennaro, Mack Publishing, 1990; and Modern
Pharmaceutics 2nd Edition, Eds. Banker and Rhodes, Marcel Dekker, Inc., 1990, each
of which is hereby incorporated by reference herein.

HCV vaccines can be administered by different routes such as
intravenous, intraperitoneal, subcutaneous, intramuscular, intradermal, impression
through the skin, or nasal. A preferred route is intramuscular.

Intramuscular administration can be performed using different
techniques such as by injection with or without one or more electric pulses. Electrically
mediated transfer can assist genetic immunization by stimulating both humoral and
cellular immune responses.

Vaccine injection can be performed using different techniques, such as
by employing a needle or a needleless injection system. An example of a needleless
injection system is a jet injection device. (Donnelly et al., International Publication
Number WO 99/52463.)

A. Electrically Mediated Transfer 

Electrically mediated transfer or Gene Electro-Transfer (GET) can be
performed by delivering suitable electric pulses after nucleic acid injection. (See
Mathiesen, International Publication Number WO 98/43702.) Plasmid injection and
electroporation can be performed using stainless needles. Needles can be used in
couples, triplets or more complex patterns. In one configuration the needles are
soldered on a printed circuit board that is a mechanical support and connects the
needles to the electrical field generator by means of suitable cables.

The electrical stimulus is given in the form of electrical pulses. Pulses 
can be of different forms (square, sinusoidal, triangular, exponential decay) and 
different polarity (monopolar of positive or negative polarity, bipolar). Pulses can be 

delivered either at constant voltage or constant current modality.

Different patterns of electric treatment can be used to introduce nucleic
acid vaccines including HCV and other nucleic acid vaccines into a patient. Possible
patterns of electric treatment include the following:

Treatment 1: 10 trains of 1000 square bipolar pulses delivered every
other second, pulse length 0.2 msec/phase, frequency 1000 Hz, constant voltage
mode, 45 Volts/phase, floating current.

Treatment 2: 2 trains of 100 square bipolar pulses delivered every other
second, pulse length 2 msec/phase, frequency 100 Hz, constant current mode, 100
mA/phase, floating voltage.

Treatment 3: 2 trains of bipolar pulses at a pulse length of about 2
msec/phase, for a total length of about 3 seconds, where the actual current going
through the tissue is fixed at about 50 mA.
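The timing implied by Treatments 1 and 2 can be worked out from the stated pulse count, frequency and phase length. The Python sketch below is illustrative only: the `PulseTrain` class is hypothetical, and it assumes that each bipolar pulse consists of two consecutive phases of the stated length with no gap between them.

```python
from dataclasses import dataclass


@dataclass
class PulseTrain:
    pulses: int           # pulses per train
    frequency_hz: float   # pulse repetition frequency
    phase_ms: float       # length of each phase of a bipolar pulse

    def period_ms(self):
        """Time from the start of one pulse to the start of the next."""
        return 1000.0 / self.frequency_hz

    def train_duration_s(self):
        """Duration of one complete train."""
        return self.pulses / self.frequency_hz

    def duty_cycle(self):
        """Fraction of each period during which current is applied (two phases)."""
        return 2 * self.phase_ms / self.period_ms()


TREATMENT_1 = PulseTrain(pulses=1000, frequency_hz=1000, phase_ms=0.2)
TREATMENT_2 = PulseTrain(pulses=100, frequency_hz=100, phase_ms=2.0)

if __name__ == "__main__":
    for name, t in (("Treatment 1", TREATMENT_1), ("Treatment 2", TREATMENT_2)):
        print(name, t.train_duration_s(), "s per train, duty cycle", t.duty_cycle())
```

Under these assumptions each train in Treatments 1 and 2 lasts about one second, which is consistent with trains being delivered every other second.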

Electric pulses are delivered through an electric field generator. A
suitable generator can be composed of three independent hardware elements
assembled in a common chassis and driven by a portable PC which runs the driving
program. The software manages both basic and accessory functions. The elements of
the device are: (1) signal generator driven by a microprocessor, (2) power amplifier
and (3) digital oscilloscope.

The signal generator delivers signals having arbitrary frequency and
shape in a given range under software control. The same software has an interactive
editor for the waveform to be delivered. The generator features a digitally controlled
current limiting device (a safety feature to control the maximal current output). The
power amplifier can amplify the signal generated up to +/- 150 V. The oscilloscope is
digital and is able to sample both the voltage and the current being delivered by the
amplifier.

B. Pharmaceutical Carriers

Pharmaceutically acceptable carriers facilitate storage and
administration of a vaccine to a subject. Examples of pharmaceutically acceptable
carriers are described herein. Additional pharmaceutically acceptable carriers are well
known in the art.

Pharmaceutically acceptable carriers may contain different components,
such as a buffer, normal saline or phosphate buffered saline, sucrose, salts and
polysorbate. An example of a pharmaceutically acceptable carrier is as follows: 2.5-10
mM TRIS buffer, preferably about 5 mM TRIS buffer; 25-100 mM NaCl, preferably
about 75 mM NaCl; 2.5-10% sucrose, preferably about 5% sucrose; 0.01-2 mM
MgCl2; and 0.001%-0.01% polysorbate 80 (plant derived). The pH is preferably from
about 7.0-9.0, more preferably about 8.0. A specific example of a carrier contains 5
mM TRIS, 75 mM NaCl, 5% sucrose, 1 mM MgCl2, 0.005% polysorbate 80 at pH
8.0.
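As a worked illustration of the specific carrier example, the Python sketch below converts the stated concentrations into grams per batch volume. Several points are assumptions not stated in the text: the percentages are read as weight/volume, TRIS is taken as Tris base (121.14 g/mol), MgCl2 as the anhydrous salt (95.21 g/mol), and the function name is hypothetical. Adjustment of the pH to 8.0 is not modeled.

```python
# Molar masses in g/mol (Tris base and anhydrous MgCl2 are assumed forms).
MOLAR_MASS = {"TRIS": 121.14, "NaCl": 58.44, "MgCl2": 95.21}


def carrier_recipe(volume_l=1.0):
    """Grams of each component of the specific carrier example for a given volume."""
    millimolar = {"TRIS": 5.0, "NaCl": 75.0, "MgCl2": 1.0}
    # Percent values read as g per 100 mL (an assumption of this sketch).
    percent_wv = {"sucrose": 5.0, "polysorbate 80": 0.005}
    grams = {name: mm / 1000.0 * MOLAR_MASS[name] * volume_l
             for name, mm in millimolar.items()}
    grams.update({name: pct * 10.0 * volume_l
                  for name, pct in percent_wv.items()})
    return grams


if __name__ == "__main__":
    for component, mass in carrier_recipe(1.0).items():
        print(f"{component}: {mass:.4f} g")
```

If a hydrated magnesium chloride salt or a Tris hydrochloride salt is used instead, the corresponding molar mass would need to be substituted.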

C. Dosing Regimes

Suitable dosing regimens can be determined taking into account the
efficacy of a particular vaccine and factors such as age, weight, sex and medical
condition of a patient; the route of administration; the desired effect; and the number
of doses. The efficacy of a particular vaccine depends on different factors such as the
ability of a particular vaccine to produce polypeptide that is expressed and processed
in a cell and presented in the context of MHC class I and II complexes.

HCV encoding nucleic acid administered to a patient can be part of
different types of vectors including viral vectors such as adenovector, and DNA
plasmid vaccines. In different embodiments concerning administration of a DNA
plasmid, about 0.1 to 10 mg of plasmid is administered to a patient, and about 1 to 5
mg of plasmid is administered to a patient. In different embodiments concerning
administration of a viral vector, preferably an adenoviral vector, about 10^5 to 10^11
viral particles are administered to a patient, and about 10^7 to 10^10 viral particles are
administered to a patient.

Viral vector vaccines and DNA plasmid vaccines may be administered
alone, or may be part of a prime and boost administration regimen. A mixed modality
priming and booster inoculation involves either priming with a DNA vaccine and
boosting with viral vector vaccine, or priming with a viral vector vaccine and boosting
with a DNA vaccine.

Multiple primings, for example, about 2 to 4 or more, may be used. The
length of time between priming and boost may typically vary from about four months
to a year, but other time frames may be used. The use of a priming regimen with a
DNA vaccine may be preferred in situations where a person has a pre-existing anti-
adenovirus immune response.

In an embodiment of the present invention, 1x10^7 to 1x10^12 particles,
and preferably about 1x10^10 to 1x10^11 particles, of adenovector are administered directly
into muscle tissue. Following initial vaccination a boost is performed with an
adenovector or DNA vaccine.

In another embodiment of the present invention initial vaccination is
performed with a DNA vaccine directly into muscle tissue. Following initial
vaccination a boost is performed with an adenovector or DNA vaccine.

Agents such as interleukin-12, GM-CSF, B7-1, B7-2, IP10, Mig-1 can
be coadministered to boost the immune response. The agents can be coadministered
as proteins or through use of nucleic acid vectors.


D. Heterologous Prime-Boost

Heterologous prime-boost is a mixed modality involving the use of one
type of viral vector for priming and another type of viral vector for boosting. The
heterologous prime-boost can involve related vectors such as vectors based on
different adenovirus serotypes and more distantly related viruses such as adenovirus and
poxvirus. The use of poxvirus and adenovirus vectors to protect mice against malaria
is illustrated by Gilbert et al., Vaccine 20:1039-1045, 2002.

Different embodiments concerning priming and boosting involve the
following types of vectors expressing desired antigens such as Met-NS3-NS4A-
NS4B-NS5A-NS5B: Ad5 vector followed by Ad6 vector; Ad6 vector followed by
Ad5 vector; Ad5 vector followed by poxvirus vector; poxvirus vector followed by
Ad5 vector; Ad6 vector followed by poxvirus vector; and poxvirus vector followed by
Ad6 vector.

The length of time between priming and boosting typically varies from
about four months to a year, but other time frames may be used. The minimum time
frame should be sufficient to allow for an immunological rest. In an embodiment, this
rest is for a period of at least 6 months. Priming may involve multiple priming with
one type of vector, such as 2-4 primings.

Expression cassettes present in a poxvirus vector should contain a
promoter either native to, or derived from, the poxvirus of interest or another poxvirus
member. Different strategies for constructing and employing different types of
poxvirus based vectors, including those based on vaccinia virus, modified vaccinia
virus, avipox virus, raccoon poxvirus, modified vaccinia virus Ankara,
canarypoxviruses (such as ALVAC), fowlpoxviruses, cowpox viruses, and NYVAC,
are well known in the art. (Moss, Current Topics in Microbiology and Immunology
158:25-38, 1982; Earl et al., In Current Protocols in Molecular Biology, Ausubel et
al. eds., New York: Greene Publishing Associates & Wiley Interscience;
1991:16.16.1-16.16.7; Child et al., Virology 174(2):625-9, 1990; Tartaglia et al.,
Virology 188:217-232, 1992; U.S. Patent Nos. 4,603,112, 4,722,848, 4,769,330,
5,110,587, 5,174,993, 5,185,146, 5,266,313, 5,505,941, 5,863,542, and 5,942,235.)

E. Adjuvants 

HCV vaccines can be formulated with an adjuvant. Adjuvants are
particularly useful for DNA plasmid vaccines. Examples of adjuvants are alum,
AlPO4, alhydrogel, Lipid-A and derivatives or variants thereof, Freund's incomplete
adjuvant, neutral liposomes, liposomes containing the vaccine and cytokines, non-
ionic block copolymers, and chemokines.

Non-ionic block polymers containing polyoxyethylene (POE) and
polyoxypropylene (POP), such as POE-POP-POE block copolymers, may be used as an
adjuvant. (Newman et al., Critical Reviews in Therapeutic Drug Carrier Systems
15:89-142, 1998.) The immune response to a nucleic acid can be enhanced using a
non-ionic block copolymer combined with an anionic surfactant.

A specific example of an adjuvant formulation is one containing CRL-1005
(CytRx Research Laboratories), DNA, and benzalkonium chloride (BAK).
The formulation can be prepared by adding pure polymer to a cold (< 5°C) solution of
plasmid DNA in PBS using a positive displacement pipette. The solution is then
vortexed to solubilize the polymer. After complete solubilization of the polymer a
clear solution is obtained at temperatures below the cloud point of the polymer (~6-
7°C). Approximately 4 mM BAK is then added to the DNA/CRL-1005 solution in
PBS, by slow addition of a dilute solution of BAK dissolved in PBS. The initial DNA
concentration is approximately 6 mg/mL before the addition of polymer and BAK,
and the final DNA concentration is about 5 mg/mL. After BAK addition the
formulation is vortexed extensively, while the temperature is allowed to increase from
~2°C to above the cloud point. The formulation is then placed on ice to decrease the
temperature below the cloud point. Then, the formulation is vortexed while the
temperature is allowed to increase from ~2°C to above the cloud point. Cooling and
mixing while the temperature is allowed to increase from ~2°C to above the cloud
point is repeated several times, until the particle size of the formulation is about 200-
500 nm, as measured by dynamic light scattering. The formulation is then stored on
ice until the solution is clear, then placed in storage at -70°C. Before use, the
formulation is allowed to thaw at room temperature.

F. Vaccine Storage

Adenovector and DNA vaccines can be stored using different types of
buffers. For example, buffer A105 described in Example 9, infra, can be used for
vector storage.

Storage of DNA can be enhanced by removal or chelation of trace
metal ions. Reagents such as succinic or malic acid, and chelators can be used to
enhance DNA vaccine stability. Examples of chelators include multiple phosphate
ligands and EDTA. The inclusion of non-reducing free radical scavengers, such as
ethanol or glycerol, can also be useful to prevent damage of DNA plasmid from free
radical production. Furthermore, the buffer type, pH, salt concentration, light
exposure, as well as the type of sterilization process used to prepare the vials, may be
controlled in the formulation to optimize the stability of the DNA vaccine.

VIII. EXAMPLES

Examples are provided below to further illustrate different features of
the present invention. The examples also illustrate useful methodology for practicing
the invention. These examples do not limit the claimed invention.

Example 1: Met-NS3-NS4A-NS4B-NS5A-NS5B Expression Cassettes

Different gene expression cassettes encoding HCV NS3-NS4A-NS4B-
NS5A-NS5B were constructed based on a 1b subtype HCV BK strain. The encoded
sequences had either (1) an active NS5B sequence ("NS"), (2) an inactive NS5B
sequence ("NSmut"), or (3) a codon optimized sequence with an inactive NS5B
sequence ("NSOPTmut"). The expression cassettes also contained a CMV
promoter/enhancer and the BGH polyadenylation signal.

The NS nucleotide sequence (SEQ. ID. NO. 5) differs from HCV BK
strain GenBank accession number M58335 by 30 out of 5952 nucleotides. The NS
amino acid sequence (SEQ. ID. NO. 6) differs from the corresponding 1b genotype
HCV BK strain by 7 out of 1984 amino acids. To allow for initiation of translation an
ATG codon is present at the 5' end of the NS sequence. A TGA termination sequence
is present at the 3' end of the NS sequence.

The NSmut nucleotide sequence (SEQ. ID. NO. 2, Figure 2) is similar
to the NS sequence. The differences between NSmut and NS include NSmut having
an altered NS5B catalytic site; an optimal ribosome binding site at the 5' end; and a
TAAA termination sequence at the 3' end. The alterations in NS5B comprise bases
5138 to 5146, which encode amino acids 1711 to 1713. The alterations result in a
change of amino acids GlyAspAsp into AlaAlaGly and create an inactive form of the
NS5B RNA-dependent RNA polymerase.
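The relationship between the altered bases and the altered amino acids can be checked with simple codon arithmetic. The Python sketch below uses 1-based numbering and hypothetical helper names. The resulting position of the first codon is an inference from the stated numbers (bases 5138 to 5146, amino acids 1711 to 1713), not a value given in the text.

```python
def implied_orf_start(base_position, codon_index):
    """Position of the first base of codon 1, given the first base of a known codon."""
    return base_position - 3 * (codon_index - 1)


def codon_span(orf_start, codon_index):
    """(first_base, last_base) of a codon, using 1-based positions and codon indices."""
    first = orf_start + 3 * (codon_index - 1)
    return first, first + 2


if __name__ == "__main__":
    start = implied_orf_start(5138, 1711)
    print("implied first base of codon 1:", start)
    for aa in (1711, 1712, 1713):
        print(aa, codon_span(start, aa))
```

Under this arithmetic the three altered codons occupy exactly bases 5138 to 5146, and a short leader of seven bases would precede the initiation codon in that numbering, which would be consistent with a ribosome binding site at the 5' end.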

The NSOPTmut sequence (SEQ. ID. NO. 3, Figure 3) was designed
based on the amino acid sequence encoded by NSmut. The NSmut amino acid
sequence was back translated into a nucleotide sequence with the GCG (Wisconsin
Package version 10, Genetics Computer Group, GCG, Madison, Wisc.)
BACKTRANSLATE program. To generate a NSOPTmut nucleotide sequence where
each amino acid is coded for by the corresponding most frequently observed human
codon, the program was run choosing as parameter the generation of the most
probable nucleotide sequence and specifying the codon frequency table of highly
expressed human genes (human_high.cod) available within the GCG Package as
translation scheme.
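The back-translation step described above can be sketched in Python as a lookup of one preferred codon per amino acid. This is not the GCG program or its codon frequency table: the `PREFERRED_CODON` table below is an illustrative assumption listing codons commonly regarded as preferred in human genes, and the function name is hypothetical.

```python
# Illustrative preferred-codon table (an assumption for this sketch, not the
# codon frequency table used by the GCG package).
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}


def back_translate(protein):
    """Back-translate a one-letter amino acid sequence using one codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODON:
            raise ValueError(f"unsupported residue: {residue!r}")
        codons.append(PREFERRED_CODON[residue])
    return "".join(codons)


if __name__ == "__main__":
    # Wild-type GlyAspAsp motif and the AlaAlaGly replacement, each after Met.
    print(back_translate("MGDD"))
    print(back_translate("MAAG"))
```

Because every residue always receives the same codon, this kind of complete optimization yields a single deterministic nucleotide sequence for a given protein sequence.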

Example 2: Generation of pV1Jns Plasmid with NS, NSmut or NSOPTmut Sequences

pV1Jns plasmids containing either the NS sequence, NSmut sequence
or NSOPTmut sequences were generated and characterized as follows:



pV1Jns Plasmid with the NS Sequence

The coding region Met-NS3-NS4A-NS4B-NS5A and the coding
region Met-NS3-NS4A-NS4B-NS5A-NS5B from a HCV BK type strain (Tomei et
al., J. Virol. 67:4017-4026, 1993) were cloned into pcDNA3 plasmid (Invitrogen),
generating pcD3-5a and pcD3-5b vectors, respectively. pcD3-5a was digested with
Hind III, blunt-ended with Klenow fill-in and subsequently digested with Xba I, to
generate a fragment corresponding to the coding region of Met-NS3-NS4A-NS4B-
NS5A. The fragment was cloned into pV1Jns-poly, digested with Bgl II, blunt-ended
with Klenow fill-in and subsequently digested with Xba I, generating pV1JnsNS3-5A.

pV1Jns-poly is a derivative of pV1JnsA plasmid (Montgomery et al.,
DNA and Cell Biol. 12:777-783, 1993), modified by insertion of a polylinker
containing recognition sites for XbaI, PmeI, PacI into the unique BglII and NotI
restriction sites. The pV1Jns plasmid with the NS sequence (pV1JnsNS3-5B) was
obtained by homologous recombination into the bacterial strain BJ5183, co-
transforming pV1JNS3-5A linearized with XbaI and NotI digestion and a PCR
fragment containing approximately 200 bp of NS5A, NS5B coding sequence and
approximately 60 bp of the BGH polyadenylation signal. The resulting plasmid
represents pV1Jns-NS.

pV1Jns-NS can be summarized as follows:

Bases 1 to 1881 of pV1JnsA
an additional AGCTT
then the Met-NS3-NS5B sequence (SEQ. ID. NO. 5)
then the wt TGA stop
an additional TCTAGAGCGTTTAAACCCTTAATTAAGG (SEQ. ID. NO. 14)
Bases 1912 to 4909 of pV1JnsA
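The additional linker listed in the summary above (SEQ. ID. NO. 14) can be scanned for the XbaI, PmeI and PacI recognition sites that the polylinker is described as containing. The Python sketch below is a minimal illustration; the dictionary and function names are hypothetical, and the linker is read as TCTAGAGCGTTTAAACCCTTAATTAAGG.

```python
# Recognition sequences of the three enzymes named for the polylinker.
SITES = {"XbaI": "TCTAGA", "PmeI": "GTTTAAAC", "PacI": "TTAATTAA"}

# Linker following the TGA stop in the plasmid summary (SEQ. ID. NO. 14).
LINKER = "TCTAGAGCGTTTAAACCCTTAATTAAGG"


def find_sites(seq, sites=SITES):
    """Return {enzyme: 0-based index of the first occurrence} for each site found."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in sites.items() if site in seq}


if __name__ == "__main__":
    for enzyme, index in find_sites(LINKER).items():
        print(f"{enzyme}: position {index + 1}")
```

All three sites are found in the order XbaI, PmeI, PacI, matching the order in which the polylinker sites are listed.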

pV1Jns Plasmid with the NSmut Sequence

The V1JnsNS3-5A plasmid was modified at the 5' end of the NS3 coding
sequence by addition of a full Kozak sequence. The plasmid (V1JNS3-5Akozak) was
obtained by homologous recombination into the bacterial strain BJ5183, co-
transforming V1JNS3-5A linearized by AflII digestion and a PCR fragment containing
the proximal part of Intron A, the restriction site BglII, a full Kozak translation
initiation sequence and part of the NS3 coding sequence.

The resulting plasmid (V1JNS3-5Akozak) was linearized with Xba I
digestion and co-transformed into the bacterial strain BJ5183 with a PCR fragment
containing approximately 200 bp of NS5A, the NS5B mutated sequence, the strong
translation termination TAAA and approximately 60 bp of the BGH polyadenylation
signal. The PCR fragment was obtained by assembling two 22bp-overlapping
fragments where mutations were introduced by the oligonucleotides used for their
amplification. The resulting plasmid represents pV1Jns-NSmut.

pV1Jns-NSmut can be summarized as follows:

Bases 1 to 1882 of pV1JnsA
then the kozak Met-NS3-NS5B(mut) TAAA sequence (SEQ. ID. NO. 2)
an additional TCTAGA
Bases 1925 to 4909 of pV1JnsA

pV1Jns Plasmid with the NSOPTmut Sequence

The human codon-optimized synthetic gene (NSOPTmut) with
mutated NS5B to abrogate enzymatic activity, a full Kozak translation initiation
sequence and a strong translation termination was digested at the BamHI and SalI
restriction sites present at the 5' and 3' ends of the gene. The gene was then cloned
into the BglII and SalI restriction sites present in the polylinker of the pV1JnsA
plasmid, generating pV1Jns-NSOPTmut.

pV1Jns-NSOPTmut can be summarized as follows:
Bases 1 to 1881 of pV1JnsA
an additional C
then kozak Met-NS3-NS5B(optmut) TAAA sequence (SEQ. ID. NO. 3)
an additional TTTAAATGTTTAAAC (SEQ. ID. NO. 15)
Bases 1905 to 4909 of pV1JnsA

Plasmid Characterization

Expression of HCV NS proteins was tested by transfection of HEK
293 cells, grown in 10% FCS/DMEM supplemented with L-glutamine (final 4 mM).
Twenty-four hours before transfection, cells were plated in 6-well plates (35 mm
diameter wells) to reach 90-95% confluence on the day of transfection. Forty
nanograms of plasmid DNA (previously assessed as a non-saturating DNA amount)
were co-transfected with 100 ng of pRSV-Luc plasmid containing the luciferase
reporter gene under the control of the Rous sarcoma virus promoter, using the
LIPOFECTAMINE 2000 reagent. Cells were kept in a CO2 incubator for 48 hours at 37°C.

Cell extracts were prepared in 1% Triton/TEN buffer. The extracts
were normalized for luciferase activity and run in serial dilution on a 10% SDS-
acrylamide gel. Proteins were transferred onto nitrocellulose and assayed with
antibodies directed against NS3, NS5A and NS5B to assess strength of expression and
correct proteolytic cleavage. Mock-transfected cells were used as a negative control.
Results from representative experiments testing pV1JnsNS, pV1JnsNSmut and
pV1JnsNSOPTmut are shown in Figure 12.

Example 3: Mice Immunization with Plasmid DNA Vectors

The DNA plasmids pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut
were injected in different mice strains to evaluate their potential to elicit
anti-HCV immune responses. Two different strains (Balb/C and C57Black6, N=9-10)
were injected intramuscularly with 25 or 50 µg of DNA followed by electrical pulses.
Each animal received two doses at a three-week interval.

Humoral immune response elicited in C57Black6 mice against the NS3
protein was measured in post dose two sera by ELISA on bacterially expressed NS3
protease domain. Antibodies specific for the tested antigen were detected in animals
immunized with all three vectors with geometric mean titers (GMT) ranging from
94000 to 133000 (Tables 1-3).



Table 1: pV1Jns-NS

Mice n.    Titer
1          105466
2          891980
3          78199
4          39496
5          543542
6          182139
7          32351
8          95028
9          67800
GMT        94553



Table 2: pV1Jns-NSmut

Mice n.    Titer
11         202981
12         55670
13         130786
14         49748
15         17672
16         174958
17         44304
18         37337
19         78182
20         193695
GMT        75083



Table 3: pV1Jns-NSOPTmut

Mice n.    Titer
21         310349
22         43645
23         63496
24         82174
25         630778
26         297259
27         66861
28         146735
29         173506
30         77732
GMT        133165
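
The GMT entries in the tables can be recomputed from the individual titers. The following is a minimal Python sketch, not part of the original disclosure: the function name is illustrative, and the input values are the individual titers listed in Tables 2 and 3.

```python
import math


def geometric_mean(titers):
    """Return the geometric mean of a non-empty list of positive titers."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive numbers")
    # Average the logarithms, then exponentiate back.
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Individual ELISA titers as listed in Table 2 and Table 3.
table2_titers = [202981, 55670, 130786, 49748, 17672,
                 174958, 44304, 37337, 78182, 193695]
table3_titers = [310349, 43645, 63496, 82174, 630778,
                 297259, 66861, 146735, 173506, 77732]

for name, titers in (("Table 2", table2_titers), ("Table 3", table3_titers)):
    print(name, "GMT:", round(geometric_mean(titers)))
```

Because the geometric mean averages logarithms, a single very high titer shifts it far less than it would shift an arithmetic mean, which is why it is the usual summary for antibody titers.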



A T cell response was measured in C57Black6 mice immunized with
two intramuscular injections at a three-week interval with 25 µg of plasmid DNA.
A quantitative ELIspot assay was performed to determine the number of IFNγ-secreting
T cells in response to five pools of 20mer peptides overlapping by ten residues
encompassing the NS3-NS5B sequence. Specific CD8+ response was analyzed by the
same assay using a 20mer peptide encompassing a CD8+ epitope for C57Black6 mice
(pep1480).

Cells secreting IFNγ in an antigen-specific manner were detected using
a standard ELIspot assay. The T cell response in C57Black6 mice immunized with two
intramuscular injections at a three-week interval with 50 µg of plasmid DNA was
analyzed by the same ELIspot assay, measuring the number of IFNγ-secreting T cells
in response to five pools of 20mer peptides overlapping by ten residues encompassing
the NS3-NS5B sequence.

Spleen cells were prepared from immunized mice and re-suspended in
R10 medium (RPMI 1640 supplemented with 10% FCS, 2 mM L-Glutamine, 50
U/ml-50 µg/ml Penicillin/Streptomycin, 10 mM Hepes, 50 µM 2-mercapto-ethanol).
Multiscreen 96-well Filtration Plates (Millipore, Cat. No. MAIPS4510, Millipore
Corporation, 80 Ashby Road Bedford, MA) were coated with purified rat anti-mouse
IFNγ antibody (PharMingen, Cat. No. 18181D, PharMingen, 10975 Torreyana
Road, San Diego, California 92121-1111 USA). After overnight incubation, plates
were washed with PBS 1X/0.005% Tween and blocked with 250 µl/well of R10
medium.

Splenocytes from immunized mice were prepared and incubated for
twenty-four hours in the presence or absence of 10 µM peptide at a density of
2.5 X 10^5/well or 5 X 10^5/well. After extensive washing (PBS 1X/0.005% Tween),
biotinylated rat anti-mouse IFNγ antibody (PharMingen, Cat. No. 18112D,
PharMingen, 10975 Torreyana Road, San Diego, California 92121-1111 USA) was
added and incubated overnight at 4°C. For development, streptavidin-AKP
(PharMingen, Cat. No. 13043E, PharMingen, 10975 Torreyana Road, San Diego,
California 92121-1111 USA) and 1-Step™ NBT-BCIP development solution (Pierce,
Cat. No. 34042, Pierce, P.O. Box 117, Rockford, IL 61105 USA) were added.

Pools of 20mer overlapping peptides encompassing the entire sequence
of the HCV BK strain NS3 to NS5B were used to reveal HCV-specific IFNγ-secreting
T cells. Similarly, a single 20mer peptide encompassing a CD8+ epitope for
C57Black6 mice was used to detect the CD8 response. Representative data from groups
of C57Black6 and Balb/C mice (N=9-10) immunized with two injections of 25 or 50
µg of plasmid vectors pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut are shown
in Figures 13A and 13B.

Example 4: Immunization of Rhesus Macaques

Rhesus macaques (N=3) were immunized by intramuscular injection
with 5 mg of plasmid pV1Jns-NSOPTmut in 7.5 mg/ml CRL1005, Benzalkonium
chloride 0.6 mM. Each animal received two doses in the deltoid muscle at 0 and 4
weeks.

CMI was measured at different time points by IFN-γ ELISPOT. This
assay measures HCV antigen-specific CD8+ and CD4+ T lymphocyte responses, and
can be used for a variety of mammals, such as humans, rhesus monkeys, mice, and
rats.

The use of a specific peptide or a pool of peptides can simplify antigen
presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and
interferon-gamma intracellular staining assays. Peptides based on the amino acid
sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5A, NS5B) were
prepared for use in these assays to measure immune responses in HCV DNA and
adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans.
The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large
pools of peptides can be used to detect an overall response to HCV proteins while
smaller pools and individual peptides may be used to define the epitope specificity of
a response.
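
The peptide design described above (20-mers offset by 10 amino acids) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name and the toy input sequence are hypothetical, and the handling of the final, possibly shorter peptide is an assumption, since the text does not say how the C-terminal end was treated.

```python
def overlapping_peptides(sequence, length=20, offset=10):
    """Split a protein sequence into peptides of `length` residues,
    each starting `offset` residues after the previous one.

    Assumption: the last peptide is allowed to be shorter than `length`.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    if not sequence:
        return []
    peptides = []
    start = 0
    while True:
        peptides.append(sequence[start:start + length])
        if start + length >= len(sequence):
            break  # this peptide already reaches the C-terminus
        start += offset
    return peptides


# Toy 45-residue sequence (hypothetical, not an HCV sequence).
toy = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACDEF"
for peptide in overlapping_peptides(toy):
    print(peptide)
```

With a 20-residue length and a 10-residue offset, consecutive peptides share 10 residues, so any epitope of up to 11 residues is fully contained in at least one peptide.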

IFNγ ELISPOT

The IFNγ-ELISPOT assay provides a quantitative determination of
HCV-specific T lymphocyte responses. PBMC are serially diluted and placed in
microplate wells coated with anti-rhesus IFN-γ antibody (MD-1 U-Cytech). They are
cultured with a HCV peptide pool for 20 hours, resulting in the restimulation of the
precursor cells and secretion of IFN-γ. The cells are washed away, leaving the
secreted IFN bound to the antibody-coated wells in concentrated areas where the cells
were sitting. The captured IFN is detected with biotinylated anti-rhesus IFN antibody
(detector Ab U-Cytech) followed by alkaline phosphatase-conjugated streptavidin
(Pharmingen 13043E). The addition of insoluble alkaline phosphatase substrate
results in dark spots in the wells at the sites where the cells were located, leaving one
spot for each T cell that secreted IFN-γ.

The number of spots per well is directly related to the precursor
frequency of antigen-specific T cells. Gamma interferon was selected as the cytokine
visualized in this assay (using species specific anti-gamma interferon monoclonal
antibodies) because it is the most common, and one of the most abundant cytokines
synthesized and secreted by activated T lymphocytes. For this assay, the number of
spot forming cells (SFC) per million PBMCs is determined for samples in the
presence and absence (media control) of peptide antigens. Data from Rhesus
macaques on PBMC from post dose two material are shown in Table 4.
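
As an illustration of the normalization described above, the sketch below converts a raw spot count per well into SFC per million PBMC. It is a hedged example only: the function names and the well counts in the usage lines are hypothetical, and subtracting the media control is a common convention assumed here rather than a step stated in the text.

```python
def sfc_per_million(spots, cells_per_well):
    """Scale a spot count from one well to spot forming cells per 10^6 cells."""
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    return spots * 1_000_000 / cells_per_well


def net_sfc_per_million(peptide_spots, control_spots, cells_per_well):
    """Peptide response minus media control, both per 10^6 cells.

    Assumption: background subtraction is applied; the text only states
    that both conditions are measured.
    """
    return (sfc_per_million(peptide_spots, cells_per_well)
            - sfc_per_million(control_spots, cells_per_well))


# Hypothetical well: 50 spots with peptide, 2 spots without, 2 x 10^5 PBMC.
print(sfc_per_million(50, 200_000))
print(net_sfc_per_million(50, 2, 200_000))
```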



Table 4

                  pV1J-NSOPTmut
Pep pools     21G     99C161    99C166
F (NS3p)      8       10        170
G (NS3h)      7       592       229
H (NS4)       3       14        16
I (NS5a)      5       71        36
L (NS5b)      14      23        11
M (NS5b)      3       35        8
DMSO          2       4         5

IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 5
mg DNA/dose in OPTIVAX/BAK of plasmid pV1Jns-NSOPTmut. Data are
expressed as SFC/10^6 PBMC.
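
The Table 4 values can be explored with a small Python sketch. The data below are copied from Table 4; treating the DMSO row as a background to subtract and reporting the strongest pool per animal are illustrative assumptions, not analysis steps described in the text.

```python
animals = ("21G", "99C161", "99C166")

# SFC per 10^6 PBMC for each peptide pool, in the animal order above.
table4 = {
    "F (NS3p)": (8, 10, 170),
    "G (NS3h)": (7, 592, 229),
    "H (NS4)": (3, 14, 16),
    "I (NS5a)": (5, 71, 36),
    "L (NS5b)": (14, 23, 11),
    "M (NS5b)": (3, 35, 8),
}
dmso = (2, 4, 5)  # vehicle control row


def subtract_background(table, background):
    """Subtract the control value per animal, clamping at zero (assumption)."""
    return {
        pool: tuple(max(value - bg, 0) for value, bg in zip(values, background))
        for pool, values in table.items()
    }


net = subtract_background(table4, dmso)

# Pool giving the highest net response for each animal.
strongest = {
    animal: max(net, key=lambda pool: net[pool][index])
    for index, animal in enumerate(animals)
}
print(net)
print(strongest)
```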

Example 5: Construction of Ad6 Pre-Adenovirus Plasmids 

Ad6 pre-adenovirus plasmids were obtained as follows: 

Construction of pAd6 E1-E3+ Pre-adenovirus Plasmid

An Ad6 based pre-adenovirus plasmid which can be used to generate
first generation Ad6 vectors was constructed either taking advantage of the extensive
sequence identity (approx. 98%) between Ad5 and Ad6 or containing only Ad6
regions. Homologous recombination was used to clone wtAd6 sequences into a
bacterial plasmid.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid
containing Ad5 and Ad6 regions is illustrated in Figure 10. Cotransformation of
BJ5183 bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed
the Ad5 ITR cassette resulted in the circularization of the viral genome by
homologous recombination. The ITR cassette contains sequences from the right (bp
33798 to 35935) and left (bp 1 to 341 and bp 3525 to 5767) end of the Ad5 genome
separated by plasmid sequences containing a bacterial origin of replication and an
ampicillin resistance gene. The ITR cassette contains a deletion of E1 sequences from
Ad5 342 to 3524. The Ad5 sequences in the ITR cassette provide regions of
homology with the purified Ad6 viral DNA in which recombination can occur.

Potential clones were screened by restriction analysis and one clone
was selected as pAd6E1-E3+. This clone was then sequenced in its entirety.
pAd6E1-E3+ contains Ad5 sequences from bp 1 to 341 and from bp 3525 to 5548, Ad6 bp
5542 to 33784, and Ad5 bp 33967 to 35935 (bp numbers refer to the wt sequence for
both Ad5 and Ad6). pAd6E1-E3+ contains the coding sequences for all Ad6 virion
structural proteins which constitute its serotype specificity.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid
containing Ad6 regions is illustrated in Figure 11. Cotransformation of BJ5183
bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed the Ad6
ITR cassette resulted in the circularization of the viral genome by homologous
recombination. The ITR cassette contains sequences from the right (bp 35460 to
35759) and left (bp 1 to 450 and bp 3508 to 3807) end of the Ad6 genome separated
by plasmid sequences containing a bacterial origin of replication and an ampicillin
resistance gene. These three segments were generated by PCR and cloned
sequentially into pNEB193, generating pNEBAd6-3 (the ITR cassette). The ITR
cassette contains a deletion of E1 sequences from Ad5 451 to 3507. The Ad6
sequences in the ITR cassette provide regions of homology with the purified Ad6 viral
DNA in which recombination can occur.

Construction of pAd6 E1-E3- Pre-adenovirus Plasmids

Ad6 based vectors containing Ad5 regions and deleted in the E3 region
were constructed starting with pAd6E1-E3+ containing Ad5 regions. A 5322 bp
subfragment of pAd6E1-E3+ containing the E3 region (Ad6 bp 25871 to 31192) was
subcloned into pABS.3 generating pABSAd6E3. Three E3 deletions were then made
in this plasmid generating three new plasmids pABSAd6E3(1.8Kb) (deleted for Ad6
bp 28602 to 30440), pABSAd6E3(2.3Kb) (deleted for Ad6 bp 28157 to 30437) and
pABSAd6E3(2.6Kb) (deleted for Ad6 bp 28157 to 30788). Bacterial recombination
was then used to substitute the three E3 deletions back into pAd6E1-E3+ generating
the Ad6 genome plasmids pAd6E1-E3-1.8Kb, pAd6E1-E3-2.3Kb and pAd6E1-E3-
2.6Kb.
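
The plasmid names above follow from the coordinates of the deleted regions. A brief Python sketch of the arithmetic is given below; it assumes the base-pair coordinates are inclusive at both ends, which is consistent with the 5322 bp size stated for the bp 25871 to 31192 subfragment. The function and variable names are illustrative.

```python
def span_length(first_bp, last_bp):
    """Length of a region given inclusive first and last base-pair coordinates."""
    if last_bp < first_bp:
        raise ValueError("last_bp must not be smaller than first_bp")
    return last_bp - first_bp + 1


# Ad6 coordinates of the three E3 deletions described in the text.
e3_deletions = {
    "1.8Kb": (28602, 30440),
    "2.3Kb": (28157, 30437),
    "2.6Kb": (28157, 30788),
}

print("E3 subfragment:", span_length(25871, 31192), "bp")
for name, (first, last) in e3_deletions.items():
    print(name, "deletion:", span_length(first, last), "bp")
```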
Example 6: Generation of Ad5 Genome Plasmid with the NS Sequence

A pcDNA3 plasmid (Invitrogen) containing the coding region NS3-NS4A-
NS4B-NS5A was digested at the XmnI and NruI restriction sites and the DNA fragment
containing the CMV promoter, the NS3-NS4A-NS4B-NS5A coding sequence and the
Bovine Growth Hormone (BGH) polyadenylation signal was cloned into the unique EcoRV
restriction site of the shuttle vector pDelE1Spa, generating the Sva3-5A vector.

A pcDNA3 plasmid containing the coding region NS3-NS4A-NS4B-
NS5A-NS5B was digested with XmnI and EcoRI (partial digestion), and the DNA
fragment containing part of NS5A, the NS5B gene and the BGH polyadenylation signal
was cloned into the Sva3-5A vector, digested with EcoRI and BglII and blunted with
Klenow, generating the Sva3-5B vector.

The Sva3-5B vector was finally digested at the SspI and Bst1107I restriction
sites and the DNA fragment containing the expression cassette (CMV promoter, NS3-
NS4A-NS4B-NS5A-NS5B coding sequence and the BGH polyadenylation signal)
flanked by adenovirus sequences was co-transformed with the ClaI linearized
pAd5HVO (E1-,E3-) genome plasmid into the bacterial strain BJ5183, to generate
pAd5HVONS. pAd5HVO contains Ad5 bp 1 to 341, bp 3525 to 28133 and bp 30818 to
35935.

Example 7: Generation of Adenovirus Genome Plasmids with the NSmut Sequence

Adenovirus genome plasmids containing an NS-mut sequence were
generated in an Ad5 or Ad6 background. The Ad6 background contained Ad5 regions
at bases 1 to 450, 3511 to 5548 and 33967 to 35935.

pV1JNS3-5Akozak was digested with BglII and XbaI restriction
enzymes and the DNA fragment containing the Kozak sequence and the sequence
coding NS3-NS4A-NS4B-NS5A was cloned into a BglII and XbaI digested
polypMRKpdelE1 shuttle vector. The resulting vector was designated shNS3-
5Akozak.

PolypMRKpdelE1 is a derivative of MRKpdelE1(Pac/pIX/pack450) +
CMVmin + BGHpA(str.) modified by the insertion of a polylinker containing
recognition sites for BglII, PmeI, SwaI, XbaI, SalI, into the unique BglII restriction
site present downstream of the CMV promoter. MRKpdelE1(Pac/pIX/pack450) +
CMVmin + BGHpA(str.) contains Ad5 sequences from bp 1 to 5792 with a deletion
of E1 sequences from bp 451 to 3510. The human CMV promoter and BGH
polyadenylation signal were inserted into the E1 deletion in an E1 parallel orientation,
with a unique BglII site separating them.

The NS5B fragment, mutated to abrogate enzymatic activity and with a
strong translation termination at the 3' end, was obtained by assembly PCR and
inserted into the shNS3-5Akozak vector via homologous recombination, generating
polypMRKpdelE1NSmut. In polypMRKpdelE1NSmut the NS-mut coding sequence
is under the control of the CMV promoter and the BGH polyadenylation signal is present
downstream.

The gene expression cassette and the flanking regions which contain
adenovirus sequences allowing homologous recombination were excised by digestion
with PacI and Bst1107I restriction enzymes and co-transformed with either
pAd5HVO (E1-,E3-) or pAd6E1-E3-2.6Kb ClaI linearized genome plasmids into the
bacterial strain BJ5183, to generate pAd5HVONSmut and pAd6E1-,E3-NSmut,
respectively.

pAd6E1-E3-2.6Kb contains Ad5 bp 1 to 341 and from bp 3525 to
5548, Ad6 bp 5542 to 28157 and from bp 30788 to 33784, and Ad5 bp 33967 to
35935 (bp numbers refer to the wt sequence for both Ad5 and Ad6). In both plasmids
the viral ITRs are joined by plasmid sequences that contain the bacterial origin of
replication and an ampicillin resistance gene.

Example 8: Generation of Adenovirus Genome Plasmids with the NSOPTmut

The human codon-optimized synthetic gene (NSOPTmut) provided by
SEQ. ID. NO. 3 cloned into a pCRBlunt vector (Invitrogen) was digested with BamHI
and SalI restriction enzymes and cloned into the BglII and SalI restriction sites present in
the shuttle vector polypMRKpdelE1. The resulting clone
(polypMRKpdelE1NSOPTmut) was digested with PacI and Bst1107I restriction
enzymes and co-transformed with either pAd5HVO (E1-,E3-) or pAd6E1-E3-2.6Kb
ClaI linearized genome plasmids into the bacterial strain BJ5183, to generate
pAd5HVONSOPTmut and pAd6E1-,E3-NSOPTmut, respectively.

Example 9: Rescue and Amplification of Adenovirus Vectors

Adenovectors were rescued in Per.C6 cells. Per.C6 were grown in 10%
FCS/DMEM supplemented with L-glutamine (final 4 mM), penicillin/streptomycin
(final 100 IU/ml) and 10 mM MgCl2. After infection, cells were kept in the same
medium supplemented with 5% horse serum (HS). For viral rescue, 2.5 X 10^6 Per.C6
were plated in 6 cm Ø Petri dishes.

Twenty-four hours after plating, cells were transfected by the calcium
phosphate method with 10 µg of the PacI linearized adenoviral DNA. The DNA
precipitate was left on the cells for 4 hours. The medium was removed and 5%
HS/DMEM was added.
Cells were kept in a CO2 incubator until a cytopathic effect was visible
(1 week). Cells and supernatant were recovered and subjected to 3X freeze/thawing
cycles (liquid nitrogen / water bath at 37°C). The lysate was centrifuged at 3000 rpm
at 4°C for 20 minutes and the recovered supernatant (corresponding to a cell lysate
containing virus passed on cells only once; P1) was used, in the amount of 1 ml/dish,
to infect 80-90% confluent Per.C6 in 10 cm Ø Petri dishes. The infected cells were
incubated until a cytopathic effect was visible, cells and supernatant were recovered
and the lysate was prepared as described above (P2).

P2 lysate (4 ml) was used to infect 2 X 15 cm Ø Petri dishes. The
lysate recovered from this infection (P3) was kept in aliquots at -80°C as a stock of
virus to be used as the starting point for big viral preparations. In this case, 1 ml of
the stock was enough to infect 2 X 15 cm Ø Petri dishes and the resulting lysate (P4)
was used for the infection of the Petri dishes devoted to the large scale infection.

Further amplification was obtained from the P4 lysate which was
diluted in medium without FCS and used to infect 30 X 15 cm Ø Petri dishes (with
Per.C6 80%-90% confluent) in the amount of 10 ml/dish. Cells were incubated for 1
hour in the CO2 incubator, mixing gently every 20 minutes. 12 ml/dish of 5% HS/
DMEM was added and cells were incubated until a cytopathic effect was visible
(about 48 hours).

Cells and supernatant were collected and centrifuged at 2K rpm for 20
minutes at 4°C. The pellet was resuspended in 15 ml of 0.1 M Tris pH=8.0. Cells
were lysed by 3X freeze/thawing cycles (liquid nitrogen / water bath at 37°C). 150 µl
of 2 M MgCl2 and 75 µl of DNAse (10 mg of bovine pancreatic deoxyribonuclease I
in 10 ml of 20 mM Tris-HCl pH=7.4, 50 mM NaCl, 1 mM dithiothreitol, 0.1 mg/ml
bovine serum albumin, 50% glycerol) were added. After a 1 hour incubation at 37°C
in a water bath (vortex every 15 minutes) the lysate was centrifuged at 4K rpm for 15
minutes at 4°C. The recovered supernatant was ready to be applied on a CsCl gradient.
The CsCl gradients were prepared in SW40 ultra-clear tubes as
follows:

0.5 ml of 1.5d CsCl
3 ml of 1.35d CsCl
3 ml of 1.25d CsCl
5 ml/tube of viral supernatant was applied

If necessary, the tubes were topped up with 0.1 M Tris-Cl pH=8.0.
Tubes were centrifuged at 35K rpm for 1 hour at 10°C with rotor SW40. The viral
bands (located at the 1.25/1.35 interface) were collected using a syringe.

The virus was transferred into a new SW40 ultra-clear tube and 1.35d
CsCl was added to top the tube up. After centrifugation at 35K rpm for 24 hours at
10°C in the rotor SW40, the virus was collected in the smallest possible volume and
dialyzed extensively against buffer A105 (5 mM Tris, 5% sucrose, 75 mM NaCl, 1
mM MgCl2, 0.005% polysorbate 80 pH=8.0). After dialysis, glycerol was added to
final 10% and the virus was stored in aliquots at -80°C.

Example 10: Enhanced Adenovector Rescue

First generation Ad5 and Ad6 vectors carrying the HCV NSOPTmut
transgene were found to be difficult to rescue. A possible block in the rescue process
might be attributed to an inefficient replication of plasmid DNA that is a sub-optimal
template for the replication machinery of adenovirus. The absence of the terminal
protein linked to the 5' ends of the DNA (normally present in the viral DNA),
associated with the very high G-C content of the transgene inserted in the E1 region of
the vector, may be causing a substantial reduction in replication rate of the plasmid-
derived adenovirus.

To set up a more efficient and reproducible procedure for rescuing Ad
vectors, an expression vector (pE2; Figure 19) containing all E2 proteins (polymerase,
pre-terminal protein and DNA binding protein) as well as E4 orf6 under the control of
a tet-inducible promoter was employed. The transfection of pE2 in combination with a
normal preadeno plasmid in PerC6 and in 293 leads to a strong increase of Ad DNA
replication and to a more efficient production of complete infectious adenovirus
particles.

Plasmid Construction

pE2 is based on the cloning vector pBI (CLONTECH) with the
addition of two elements to allow episomal replication and selection in cell culture:
(1) the EBV-OriP (EBV [nt] 7421-8042) region permitting plasmid replication in
synchrony with the cell cycle when EBNA-1 is expressed and (2) the hygromycin-B
phosphotransferase (HPH)-resistance gene allowing a positive selection of
transformed cells. The two transcriptional units for the adenoviral genes E2 a and b
and E4-Orf6 were constructed and assembled in pE2 as described below.

The Ad5-Polymerase ClaI/SphI fragment and the Ad5-pTP
Acc65/EcoRV fragment were obtained from pVac-Pol and pVac-pTP (Stunnenberg et
al., NAR 16:2431-2444, 1988). Both fragments were filled with Klenow and cloned
into the SalI (filled) and EcoRV sites of pBI, respectively, obtaining pBI-Pol/pTP.

The EBV-OriP element from pCEP4 (Invitrogen) was first inserted within
two chicken β-globin insulator dimers by cloning it into the BamHI site of pJC13-1
(Chung et al., Cell 74(3):505-14, 1993). The HS4-OriP fragment from pJC13-OriP was
then cloned inside pSA-1mv (a plasmid containing the tk-Hygro-B resistance gene
expression cassette as well as the Ad5 replication origin), the ITRs arranged as a
head-to-tail junction, obtained by PCR from pFG140 (Graham, EMBO J. 3:2917-2922, 1984)
using the following primers: 5'-TCGAATCGATACGCGAACCTACGC-3' (SEQ.
ID. NO. 16) and 5'-TCGACGTGTCGACTTCGAAGCGCACACCAAAAACGTC-3'
(SEQ. ID. NO. 17), thus generating pMVHS4Orip. A DNA fragment from
pMVHS4Orip, containing the insulated OriP, Ad5 ITR junction and tk-HygroB
cassette, was then inserted into the pBI-Pol/pTP vector restricted with AseI/AatII,
generating pBI-Pol/pTPHS4.

To construct the second transcriptional unit expressing Ad5-Orf6 as
well as Ad5-DBP, E4orf6 (Ad 5 [nt] 33193-34077) obtained by PCR was first
inserted into the pBI vector, generating pBI-Orf6. Subsequently, the DBP coding DNA
sequence (Ad 5 [nt] 22443-24032) was inserted into pBI-Orf6, obtaining the second
bi-directional Tet-regulated expression vector (pBI-DBP/E4orf6). The original polyA
signals present in pBI were substituted with BGH and SV40 polyA.

pBI-DBP/E4orf6 was then modified by inserting a DNA fragment
containing the Adeno5-ITRs arranged in head-to-tail junction plus the hygromycin B
resistance gene obtained from plasmid pSA-1mv. The new plasmid pBI-
DBP/E4orf6shuttle was then used as donor plasmid to insert the second tet-regulated
transcriptional unit into pBI-Pol/pTPHS4 by homologous recombination using E. coli
strain BJ5183, obtaining pE2.

Cell lines, Transfections and Virus Amplification

PerC6 cells were cultured in Dulbecco's modified Eagle's Medium
(DMEM) plus 10% fetal bovine serum (FBS), 10 mM MgCl2, penicillin (100 U/ml),
streptomycin (100 µg/ml) and 2 mM glutamine.

All transient transfections were performed using Lipofectamine2000
(Invitrogen) as described by the manufacturer. 90% confluent PER.C6™ cells plated in
6-cm plates were transfected with 3.5 µg of Ad5/6NSOPTmut pre-adeno plasmids,
digested with PacI, alone or in combination with 5 µg pE2 plus 1 µg pUHD52.1.
pUHD52.1 is the expression vector for the reverse tet transactivator 2 (rtTA2)
(Urlinger et al., Proc. Natl. Acad. Sci. U.S.A. 97(14):7963-7968, 2000). Upon
transfection, cells were cultivated in the presence of 1 µg/ml of doxycycline to
activate pE2 expression. Seven days post-transfection cells were harvested and cell
lysate was obtained by three cycles of freeze-thaw. Two ml of cell lysate were used to
infect a second 6-cm dish of PerC6. Infected cells were cultivated until a full CPE was
observed, then harvested. The virus was serially passaged five times as described
above, then purified on a CsCl gradient. The DNA structure of the purified virus was
controlled by endonuclease digestion and agarose gel electrophoresis analysis and
compared to the original pre-adeno plasmid restriction pattern.

Example 11: Partial Optimization of HCV Polyprotein Encoding Nucleic Acid

Partial optimization of HCV polyprotein encoding nucleic acid was
performed to facilitate the production of adenovectors containing codons optimized
for expression in a human host. The overall objective was to provide for increased
expression due to codon optimization, while facilitating the production of an
adenovector encoding HCV polyprotein.

Several difficulties were encountered in producing an adenovector
encoding HCV polyprotein with codons optimized for expression in a human host. An
adenovector containing an optimized sequence (SEQ. ID. NO. 3) was found to be
more difficult to synthesize and rescue than an adenovector containing a non-
optimized sequence (SEQ. ID. NO. 2).

The difficulties in producing an adenovector containing SEQ. ID. NO.
3 were attributed to a high GC content. A particularly problematic region was the
region at about position 3900 of NSOPTmut (SEQ. ID. NO. 3).

Alternative versions of optimized HCV encoding nucleic acid
sequence were designed to facilitate its use in an adenovector. The alternative
versions, compared to NSOPTmut, were designed to have a lower overall GC content,
to reduce/avoid the presence of potentially problematic motifs of consecutive G's or
C's, while maintaining a high level of codon optimization to allow improved
expression of the encoded polyprotein and the individual cleavage products.

A starting point for the generation of a suboptimally codon-optimized
sequence is the coding region of the NSOPTmut nucleotide sequence (bases 7 to 5961
of SEQ. ID. NO. 3). Values for codon usage frequencies (normalized to a total of 1.0
for each amino acid) were taken from the file human_high.cod available in the
Wisconsin Package Version 10.3 (Accelrys Inc., a wholly owned subsidiary of
Pharmacopeia, Inc).

To reduce the local and overall GC content a table defining preferred
codon substitutions for each amino acid was manually generated. For each amino acid
the codon having 1) a lower GC content as compared to the most frequent codon and
2) a relatively high observed codon usage frequency (as defined in human_high.cod)
was chosen as the replacement codon. For example, for Arg the codon with the
highest frequency is CGC. Out of the other five alternative codons encoding Arg
(CGG, AGG, AGA, CGT, CGA) three (AGG, CGT, CGA) reduce the GC content by
1 base, one (AGA) by two bases and one (CGG) by 0 bases. Since the AGA codon is
listed in human_high.cod as having a relatively low usage frequency (0.1), the codon
substituting CGC was therefore chosen to be AGG with a relative frequency of 0.18.
Similar criteria were applied in order to establish codon replacements for the other
amino acids resulting in the list shown in Table 5. Parameters applied in the following
optimization procedure were determined empirically such that the resulting sequence
maintained a considerably improved codon usage (for each amino acid) and the GC
content (overall and in the form of local stretches of consecutive G's and/or C's) was
decreased.
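
The Arg example above can be checked with a short Python sketch. This is an illustration under stated assumptions, not the procedure actually used: only the two usage frequencies quoted in the text (AGA 0.1, AGG 0.18) are included, the 0.15 frequency cutoff is a hypothetical stand-in for "relatively high", and all function names are illustrative.

```python
def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def gc_reduction(reference, candidate):
    """How many fewer G/C bases the candidate has than the reference codon."""
    return gc_count(reference) - gc_count(candidate)


def choose_replacement(reference, frequencies, min_frequency):
    """Pick the most frequently used candidate that lowers GC content.

    `frequencies` maps candidate codons to relative usage frequencies;
    `min_frequency` is an assumed cutoff for an acceptable frequency.
    """
    eligible = [
        codon for codon, freq in frequencies.items()
        if gc_reduction(reference, codon) > 0 and freq >= min_frequency
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda codon: frequencies[codon])


most_frequent_arg = "CGC"
alternatives = ["CGG", "AGG", "AGA", "CGT", "CGA"]
for codon in alternatives:
    print(codon, "reduces GC by", gc_reduction(most_frequent_arg, codon))

# Only the frequencies quoted in the text; the cutoff is hypothetical.
quoted_frequencies = {"AGA": 0.10, "AGG": 0.18}
print(choose_replacement(most_frequent_arg, quoted_frequencies, min_frequency=0.15))
```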

Two examples of partially optimized HCV encoding sequences are
provided by SEQ. ID. NO. 10 and SEQ. ID. NO. 11. SEQ. ID. NO. 10 provides an
HCV encoding sequence that is partially optimized throughout. SEQ. ID. NO. 11
provides an HCV encoding sequence fully optimized for codon usage with the
exception of a region that was partially optimized.

Codon optimization was performed using the following procedure:

Step 1) The coding region of the input fully optimized NSOPTmut
sequence was analyzed using a sliding window of 3 codons (9 bases), shifting the
window by one codon after each cycle. Whenever a stretch containing 5 or more
consecutive C's and/or G's was detected in the window the following replacement rule
was applied: Let N indicate the number of codon replacements previously performed.
If N is odd replace the middle codon in the window with the codon specified in Table
5, if N is even replace the third terminal codon in the window with the codon
specified in a codon optimization table such as human_high.cod. If Leu or Val is
present at the second or third codon do not apply any replacement in order not to
introduce Leu or Val codons with very low relative codon usage frequency (see, for
example, human_high.cod). In the following cycle analysis of the shifted window was
then applied to a sequence containing the replacements of the previous cycle.

The alternating replacement of the middle and terminal codon in the 3
codon window was found empirically to give a more satisfying overall maintenance of
optimized codon usage while also reducing GC content (as judged from the final
sequence after the procedure). In general, however, the precise replacement strategy
depends on the amino acid sequence encoded by the nucleotide sequence under
analysis and will have to be determined empirically.
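
Step 1 can be sketched in Python as follows. This is a simplified illustration, not the program actually used. A single replacement table stands in for both Table 5 and the codon optimization table. Only the CGC to AGG entry comes from the text; the other entries are hypothetical synonymous substitutions. Codons without a table entry are left unchanged and are not counted in N.

```python
import re

GC_RUN = re.compile(r"[GC]{5,}")  # 5 or more consecutive G's and/or C's

# Standard genetic code: all codons for Leu and Val.
LEU_VAL_CODONS = {"CTT", "CTC", "CTA", "CTG", "TTA", "TTG",
                  "GTT", "GTC", "GTA", "GTG"}

# Stand-in for Table 5. CGC -> AGG is from the text; the rest are hypothetical.
REPLACEMENTS = {"CGC": "AGG", "GCC": "GCT", "GGC": "GGA", "CCC": "CCT"}


def break_gc_runs(sequence, replacements=None):
    """Slide a 3-codon window over the sequence and break long G/C runs."""
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    table = REPLACEMENTS if replacements is None else replacements
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    n_replaced = 0  # N: number of replacements performed so far
    for start in range(len(codons) - 2):
        window = codons[start:start + 3]
        if not GC_RUN.search("".join(window)):
            continue
        # Skip the window if Leu or Val is at the second or third codon.
        if window[1] in LEU_VAL_CODONS or window[2] in LEU_VAL_CODONS:
            continue
        # N odd: middle codon; N even: third codon.
        target = start + 1 if n_replaced % 2 == 1 else start + 2
        new_codon = table.get(codons[target])
        if new_codon is not None:
            codons[target] = new_codon
            n_replaced += 1
    return "".join(codons)


print(break_gc_runs("ATGCGCGGCCCCAAA"))
```

Because the codon list is edited in place, each shifted window sees the replacements made in earlier cycles, as the procedure requires.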

Step 2) The sequence containing all the codon replacements performed
during step 1) was then subjected to an additional analysis using a sliding window of
21 codons (63 bases) in length: according to an adjustable parameter, the overall GC
content in the window was determined. If the GC content in the window was higher
than 70%, the following codon replacement strategy was applied: In the window,
replace the codons for the amino acids Asn, Asp, Cys, Glu, His, Ile, Lys, Phe, Tyr by
the codons given in Table 5. Restriction of the replacement to this set of amino acids
was motivated by the fact that a) the replacement codon still has an acceptably high
frequency of usage in human_high.cod and b) the average overall human codon usage
in CUTG for the replacement codon is nearly as high as the most frequent codon. In
the following cycle, analysis of the shifted window is then applied to a sequence
containing the replacements of the previous cycle.
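
A minimal sketch of the step 2) window analysis is given below. The function names and default parameters are illustrative assumptions; the codon pairs are those listed in Table 5 for the nine amino acids named above, and the input is assumed to be an upper-case coding sequence whose length is a multiple of 3.

```python
# Table 5 entries for Asn, Asp, Cys, Glu, His, Ile, Lys, Phe and Tyr only.
STEP2_REPLACEMENTS = {
    "AAC": "AAT", "GAC": "GAT", "TGC": "TGT", "GAG": "GAA", "CAC": "CAT",
    "ATC": "ATT", "AAG": "AAA", "TTC": "TTT", "TAC": "TAT",
}


def gc_fraction(bases):
    """Fraction of G and C bases in a string of nucleotides."""
    return sum(base in "GC" for base in bases) / len(bases)


def step2_reduce_window_gc(seq, window_codons=21, threshold=0.70):
    """Replace selected codons in every window whose GC content exceeds threshold."""
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    for start in range(len(codons) - window_codons + 1):
        end = start + window_codons
        if gc_fraction("".join(codons[start:end])) > threshold:
            for i in range(start, end):
                codons[i] = STEP2_REPLACEMENTS.get(codons[i], codons[i])
    return "".join(codons)
```

Sequences shorter than one window are returned unchanged, and both the window length and the threshold are exposed as parameters because the text notes that they have to be determined empirically.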

The threshold 70% was determined empirically by compromising
between an overall reduction in GC content and maintenance of a high codon
optimization for the individual amino acids. As in step 1), the precise replacement
strategy (choice of amino acids and GC content threshold value) will again depend on
the amino acid sequence encoded by the nucleotide sequence under analysis and will
have to be determined empirically.

Step 3) The sequence generated by steps 1) and 2) was then manually
edited and additional codons were changed according to the following criteria:
Regions still having a GC content higher than 70% over a window of 21 codons were
examined manually and a few codons were replaced, again following the scheme given
in Table 5.

Subsequent steps were performed to provide for useful restriction sites,
to remove possible open reading frames on the complementary strand, to add
homologous recombinant regions, to add a Kozak signal, and to add a terminator.
These steps are numbered 4-7.

Step 4) The sequence generated in step 3 was examined for the absence
of certain restriction sites (BglII, PmeI and XbaI) and presence of only 1 StuI site to
allow a subsequent cloning strategy using a subset of restriction enzymes. Two sites
(one for BglII and one for StuI) were removed from the sequence by replacing codons
that were part of the respective recognition sites.
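
The check described in step 4) can be sketched as a simple recognition-site count. The function names are hypothetical; the recognition sequences are the standard ones for these enzymes, and because all four are palindromic, scanning the sense strand alone is sufficient.

```python
# Standard recognition sequences (all palindromic).
RECOGNITION_SITES = {
    "BglII": "AGATCT",
    "PmeI": "GTTTAAAC",
    "XbaI": "TCTAGA",
    "StuI": "AGGCCT",
}


def count_sites(seq):
    """Count occurrences (including overlapping ones) of each site in seq."""
    return {
        name: sum(seq.startswith(site, i) for i in range(len(seq)))
        for name, site in RECOGNITION_SITES.items()
    }


def passes_step4(seq):
    """True if seq has no BglII, PmeI or XbaI site and exactly one StuI site."""
    counts = count_sites(seq)
    return (counts["BglII"] == 0 and counts["PmeI"] == 0
            and counts["XbaI"] == 0 and counts["StuI"] == 1)
```

A sequence failing this check would then be edited by swapping a synonymous codon inside the offending site, as was done for the two sites mentioned above.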

Step 5) The sequence generated by steps 1) through 4) was then
modified to allow subsequent generation of a modified NSOPTmut
sequence (by homologous recombination). In the sequence obtained from steps 1)
through 4), the segment comprising base 3556 to 3755 and the segment comprising
base 4456 to 4656 were replaced by the corresponding segments from NSOPTmut.
The segment comprising bases 3556 to 4656 of SEQ. ID. NO. 10 can be used to
replace the problematic region in NSOPTmut (around position 3900) by homologous
recombination, thus creating the variant of NSOPTmut having the sequence of SEQ.
ID. NO. 11.

Step 6) Analysis of the sequence generated through steps 1) to 5)
revealed a potential open reading frame spanning nearly the complete fragment on the
complementary strand. Removal of all codons CTA and TTA (Leu) and TCA (Ser)
from the sense strand effectively removed all stop codons in one of the reading frames
on the complementary strand. Although the likelihood of transcription of this
complementary strand open reading frame and subsequent translation into protein is
very small, in order to exclude a potential interference with the transcription and
subsequent translation of the sequence encoded on the sense strand, TCA codons for
Ser were introduced on the sense strand approximately every 500 bases. No changes were
introduced in the segments introduced during step 5) to allow homologous
recombination. The TCA codon for Ser was preferred over the CTA and TTA codons
for Leu because of the higher relative frequency for TCA (0.05) as compared to CTA
(0.02) and TTA (0.03) in human_high.cod. In addition, the average human codon
usage from CUTG favored TCA (0.14 against 0.07 for CTA and TTA).
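
The reasoning of step 6) rests on the fact that the sense codons CTA, TTA and TCA are the reverse complements of the stop codons TAG, TAA and TGA. The short sketch below illustrates this and locates such codons in the sense reading frame; the function names are hypothetical and the input is assumed to be an upper-case DNA string.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = ("TAA", "TAG", "TGA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sense-strand triplets that read as stop codons on the complementary strand.
SENSE_TRIPLETS = {reverse_complement(stop) for stop in STOP_CODONS}


def complementary_stop_positions(sense, frame=0):
    """Offsets of in-frame sense codons that are stops on the other strand."""
    return [i for i in range(frame, len(sense) - 2, 3)
            if sense[i:i + 3] in SENSE_TRIPLETS]
```

Reintroducing a TCA (Ser) codon at intervals therefore places TGA stop codons in the corresponding complementary-strand reading frame.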

Step 7) In a final step GCCACC was added at the 5' end of the
sequence to generate an optimized internal ribosome entry site (Kozak signal) and a
TAAA stop signal was added at the 3' end. To maintain the initiation of translation

WO 03/031588 PCT/US02/32512 



properties of NSsuboptmut, the first 8 codons of the coding region were kept identical
to the NSOPTmut sequence. The resulting sequence was again checked for the
absence of BglII, PmeI and XbaI recognition sites and the presence of only 1 StuI site.
The NSsuboptmut sequence (SEQ. ID. NO. 10) has an overall reduced
GC content (63.5%) as compared to NSOPTmut (70.3%) and maintains a well
optimized level of codon usage optimization. Nucleotide sequence identity of
NSsuboptmut is 77.2% with respect to NSmut.

Table 5: Definition of codon replacements performed during steps 1) and 2).

Amino   Most frequent   Relative    Reduction in GC   Replacement   Relative
Acid    codon           frequency   content (bases)   codon         frequency

Amino Acids where the replacement codon reduces the codon GC-content by 1 base
Ala     GCC             0.51        1                 GCT           0.17
Arg     CGC             0.37        1                 AGG           0.18
Asn     AAC             0.78        1                 AAT           0.22
Asp     GAC             0.75        1                 GAT           0.25
Cys     TGC             0.68        1                 TGT           0.32
Glu     GAG             0.75        1                 GAA           0.25
Gln     CAG             0.88        1                 CAA           0.12
Gly     GGC             0.50        1                 GGA           0.14
His     CAC             0.79        1                 CAT           0.21
Ile     ATC             0.77        1                 ATT           0.18
Lys     AAG             0.82        1                 AAA           0.18
Phe     TTC             0.80        1                 TTT           0.20
Pro     CCC             0.48        1                 CCT           0.19
Ser     AGC             0.34        1                 TCT           0.13
Thr     ACC             0.51        1                 ACA           0.14
Tyr     TAC             0.74        1                 TAT           0.26

Amino Acids with no alternative codon
Met     ATG             1.00        0                 ATG           1.00
Trp     TGG             1.00        0                 TGG           1.00

Amino Acids where the replacement codon has a very low relative frequency.
These amino acids were excluded from the replacement procedure
Leu     CTG             0.58                          TTG           0.06
Val     GTG             0.64                          GTT           0.07



Example 12: Adenovector Characterization

Adenovectors were characterized by: (a) measuring the physical
particles/ml; (b) running a TaqMan PCR assay; and (c) checking protein expression
after infection of HeLa cells.

a) Physical Particles Determination

CsCl purified virus was diluted 1/10 and 1/100 in 0.1% SDS PBS. As
a control, buffer A105 was used. These dilutions were incubated 10 minutes at 55°C.
After spinning the tubes briefly, O.D. at 260 nm was measured. The amount of
particles was calculated as follows: 1 OD 260 nm = 1.1 × 10^12 physical particles/ml.
The results were typically between 5 × 10^11 and 1 × 10^12 physical particles/ml.
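
As a worked illustration of this conversion, the sketch below turns an O.D. 260 reading into physical particles/ml. The function name is hypothetical, and multiplying by the dilution factor to obtain the titer of the undiluted stock is an assumption that the text does not state explicitly.

```python
PARTICLES_PER_OD260 = 1.1e12  # 1 OD 260 nm = 1.1 x 10^12 physical particles/ml


def physical_particles_per_ml(od260, dilution_factor=1.0):
    """Convert a blank-corrected OD260 reading to physical particles/ml."""
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("od260 must be >= 0 and dilution_factor > 0")
    return od260 * dilution_factor * PARTICLES_PER_OD260
```

For example, a reading of 0.05 on the 1/10 dilution would correspond to about 5.5 × 10^11 physical particles/ml, which lies within the typical range quoted above.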

b) TaqMan PCR Assay

TaqMan PCR assay was used for adenovector genome quantification
(Q-PCR particles/ml). TaqMan PCR assay was performed using the ABI Prism 7700
sequence detector. The reaction was performed in a final 50 µl volume in the
presence of oligonucleotides (at final 200 nM) and probe (at final 200 µM) specific
for the adenoviral backbone. The virus was diluted 1/10 in 0.1% SDS PBS and
incubated 10 minutes at 55°C. After spinning the tube briefly, serial 1/10 dilutions (in
water) were prepared. 10 µl of the 10^-3, 10^-5 and ... dilutions were used as templates in
the PCR assay.

The amount of particles present in each sample was calculated on the
basis of a standard curve run in the same experiment. Typically results were between
1 × 10^12 and 3 × 10^12 Q-PCR particles/ml.
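
The text does not describe how the standard curve was evaluated. A common approach, shown here only as a hedged sketch, is a least-squares line relating the threshold cycle (Ct) to log10 of the template quantity; the function names, the use of Ct values, and the example numbers in the usage note are all assumptions.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept, dilution_factor=1.0):
    """Interpolate a sample quantity from its Ct and scale by its dilution."""
    return 10 ** ((ct - intercept) / slope) * dilution_factor
```

With hypothetical standards at 10^3 to 10^6 copies, the fitted line would be used to read off the quantity in each diluted sample, which is then multiplied by the dilution factor and normalized to volume to give Q-PCR particles/ml.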

c) Expression of HCV Non-Structural Proteins

Expression of HCV NS proteins was tested by infection of HeLa cells.
Cells were plated the day before the infection at 1.5 × 10^6 cells/dish (10 cm Ø Petri
dishes). Different amounts of CsCl purified virus corresponding to m.o.i. of 50, 250
and 1250 pp/cell were diluted in medium (FCS free) up to a final volume of 5 ml. The
diluted virus was added on the cells and incubated for 1 hour at 37°C in a CO2
incubator (gently mixing every 20 minutes). 5 ml of 5% HS-DMEM was added and
the cells were incubated at 37°C for 48 hours.

Cell extracts were prepared in 1% Triton/TEN buffer. The extracts
were run on 10% SDS-acrylamide gel, blotted on nitrocellulose and assayed with
antibodies directed against NS3, NS5a and NS5b in order to check the correct
polyprotein cleavage. Mock-infected cells were used as a negative control. Results
from representative experiments testing the Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut
and MRKAd6-NSOPTmut are shown in Figure 14.

Example 13: Mice Immunization with Adenovectors Encoding Different NS
Cassettes

The adenovectors Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut and
MRKAd6-NSOPTmut were injected into C57Black6 mice to evaluate their
potential to elicit anti-HCV immune responses. Groups of animals (N=9-10) were
injected intramuscularly with 10^9 pp of CsCl purified virus. Each animal received two
doses at a three-week interval.

Humoral immune response against the NS3 protein was measured in
post dose two sera from C57Black6 immunized mice by ELISA on bacterially
expressed NS3 protease domain. Antibodies specific for the tested antigen were
detected with geometric mean titers (GMT) ranging from 100 to 46000 (Tables 6, 7, 8
and 9).
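
For reference, a geometric mean titer is the antilog of the mean of the log-transformed individual titers. The sketch below, with a hypothetical function name, reproduces the GMT values reported for Table 6 (Ad5-NS) and Table 9 (MRKAd6-NSOPTmut) from the individual titers listed there.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean: exp of the arithmetic mean of the natural logs."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive values")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Individual titers as listed in Table 6 and Table 9.
table6_titers = [50, 253, 50, 50, 50, 2257, 504, 50, 50, 50]
table9_titers = [25430, 3657, 893, 175, 10442, 49540, 173]

gmt_table6 = geometric_mean_titer(table6_titers)  # about 108
gmt_table9 = geometric_mean_titer(table9_titers)  # about 2785
```

Because the mean is taken on a log scale, a few very high responders raise the GMT far less than they would raise an arithmetic mean.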

Table 6: Ad5-NS

Mice n.   1     2     3     4     5     6      7     8     9     10    GMT
Titer     50    253   50    50    50    2257   504   50    50    50    108

Table 7: Ad5-NSmut

Mice n.   11     12      13      14     15      16     17      18      19      20      GMT
Titer     3162   78850   87241   6796   12134   3340   18473   13093   76167   49593   23645



Table 8: MRKAd6-NSmut

Mice n.   21       22      23      24      25      26      27      28      29      30      GMT
Titer     125626   39751   40187   65834   60619   69933   21555   49348   29290   26859   46461



Table 9: MRKAd6-NSOPTmut

Mice n.   31      32     33    34    35      36      37    GMT
Titer     25430   3657   893   175   10442   49540   173   2785



T cell response in C57Black6 mice was analyzed by the quantitative
ELISPOT assay measuring the number of IFN-γ secreting T cells in response to five
pools (named from F to L+M) of 20mer peptides overlapping by ten residues
encompassing the NS3-NS5B sequence. Specific CD8+ response induced in
C57Black6 mice was analyzed by the same assay using a 20mer peptide
encompassing a CD8+ epitope for C57Black6 mice (pep1480). Cells secreting IFN-γ
in an antigen-specific manner were detected using a standard ELISPOT assay.

Spleen cells, splenocytes and peptides were produced and treated as
described in Example 3, supra. Representative data from groups of mice
(N=9-10) immunized with two injections of 10^9 viral particles of vectors Ad5-NS,
MRKAd5-NSmut and MRKAd6-NSmut are shown in Figure 15.

Example 14: Immunization of Rhesus Macaques with Adenovectors

Rhesus macaques (N=3-4) were immunized by intramuscular injection
of CsCl purified Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut or MRKAd6-NSOPTmut
virus. Each animal received two doses of 10^11 or 10^10 vp in the deltoid
muscle at 0 and 4 weeks.

CMI was measured at different time points by a) IFN-γ ELISPOT (see
Example 3, supra), b) IFN-γ ICS and c) bulk CTL assays. These assays measure HCV
antigen-specific CD8+ and CD4+ T lymphocyte responses, and can be used for a
variety of mammals, such as humans, rhesus monkeys, mice, and rats.

The use of a specific peptide or a pool of peptides can simplify antigen
presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and
interferon-gamma intracellular staining assays. Peptides based on the amino acid
sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5a, NS5b) were
prepared for use in these assays to measure immune responses in HCV DNA and
adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans.
The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large
pools of peptides can be used to detect an overall response to HCV proteins while
smaller pools and individual peptides may be used to define the epitope specificity of
a response.
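
The peptide design described above (20-mers offset by 10 amino acids) can be sketched as follows. The function name is hypothetical, and the handling of a C-terminal remainder (adding one final 20-mer ending at the last residue) is an assumption, since the text does not say how the end of each protein was covered.

```python
def overlapping_peptides(protein, length=20, offset=10):
    """Split a protein sequence into overlapping peptides of fixed length."""
    if len(protein) <= length:
        return [protein]
    peptides = [protein[i:i + length]
                for i in range(0, len(protein) - length + 1, offset)]
    covered = (len(peptides) - 1) * offset + length
    if covered < len(protein):
        # Assumption: cover the C-terminal tail with one extra peptide.
        peptides.append(protein[-length:])
    return peptides
```

Each consecutive pair of peptides shares 10 residues, so any epitope of up to 11 residues is fully contained in at least one peptide.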

IFN-γ ICS

For IFN-γ ICS, 2 × 10^6 PBMC in 1 ml R10 (RPMI medium,
supplemented with 10% FCS) were stimulated with peptide pool antigens. Final
concentration of each peptide was 2 µg/ml. Cells were incubated for 1 hour in a CO2
incubator at 37°C and then Brefeldin A was added to a final concentration of 10 µg/ml
to inhibit the secretion of soluble cytokines. Cells were incubated for an additional
14-16 hours at 37°C.

Stimulation was done in the presence of co-stimulatory antibodies:
CD28 and CD49d (anti-humanCD28 BD340975 and anti-humanCD49d BD340976).
After incubation, cells were stained with fluorochrome-conjugated antibodies for
surface antigens: anti-CD3, anti-CD4, anti-CD8 (CD3-APC Biosource APS0301,
CD4-PE BD345769, CD8-PerCP BD345774).

To detect intracellular cytokines, cells were treated with FACS
permeabilization buffer 2 (BD340973), 2x final concentration. Once fixed and
permeabilized, cells were incubated with an antibody against human IFN-γ,
IFN-γ FITC (Biosource AHC4338).

Cells were resuspended in 1% formaldehyde in PBS and analyzed by
FACS within 24 hours. Four color FACS analysis was performed on a FACSCalibur
instrument (Becton Dickinson) equipped with two lasers. Acquisition was done
gating on the lymphocyte population in the Forward versus Side Scatter plot coupled
with the CD3, CD8 positive populations. At least 30,000 events of the gate were
taken. The positive cells are expressed as number of IFN-γ expressing cells over 10^6
lymphocytes.
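
The normalization in the last sentence can be written as a one-line calculation. This is only an illustrative sketch; the function name and the example counts used in the test are hypothetical.

```python
def ifn_gamma_cells_per_million(positive_events, gated_events):
    """Express IFN-gamma positive events per 10^6 gated lymphocytes."""
    if gated_events <= 0:
        raise ValueError("gated_events must be positive")
    return positive_events * 1_000_000 / gated_events
```

For instance, 45 IFN-γ positive events among 30,000 gated events would be reported as 1500 IFN-γ expressing cells per 10^6 lymphocytes.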

IFN-γ ELISPOT and IFN-γ ICS data from immunized monkeys after
one or two injections of 10^10 or 10^11 vp of the different adenovectors are reported in
Figures 16A-16D, 17A, and 17B.

Bulk CTL Assays

A distinguishing effector function of T lymphocytes is the ability of
subsets of this cell population to directly lyse cells exhibiting appropriate MHC-associated
antigenic peptides. This cytotoxic activity is most often associated with
CD8+ T lymphocytes.

PBMC samples were infected with recombinant vaccine viruses
expressing HCV antigens in vitro for approximately 14 days to provide antigen
restimulation and expansion of memory T cells. Cytotoxicity against autologous B
cell lines treated with peptide antigen pools was tested.

The lytic function of the culture is measured as a percentage of specific
lysis resulting from chromium released from target cells during a 4 hour incubation
with CTL effector cells. Specific cytotoxicity is measured and compared to irrelevant
antigen or excipient-treated B cell lines. This assay is semi-quantitative and is the
preferred means for determining whether CTL responses were elicited by the vaccine.
Data after two injections from monkeys immunized with 10^11 vp/dose with
adenovectors Ad5-NS, MRKAd5-NSmut and MRKAd6-NSmut are reported in
Figures 18A-18F.
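
The text does not give the formula used for the percentage of specific lysis. The sketch below uses the conventional chromium release calculation, which is an assumption here, with hypothetical argument names for the experimental, spontaneous and maximum release counts.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Conventional chromium release formula (assumed, not from the text).

    experimental: counts released from targets incubated with effectors
    spontaneous:  counts released from targets incubated alone
    maximum:      counts released from fully lysed targets
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)
```

The same calculation applied to B cell lines treated with irrelevant antigen or excipient gives the background against which antigen-specific lysis is compared.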



Other embodiments are within the following claims. While several
embodiments have been shown and described, various modifications may be made
without departing from the spirit and scope of the present invention.

WHAT IS CLAIMED IS:

1. A nucleic acid comprising a nucleotide sequence encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ ID
NO: 1, provided that said polypeptide has sufficient protease activity to process itself
to produce an NS5B protein and said NS5B protein is enzymatically inactive.

2. The nucleic acid of claim 1, wherein said nucleotide sequence
is substantially similar to the coding sequence of SEQ ID NO: 2.

3. The nucleic acid of claim 1, wherein said nucleotide sequence
encodes for the polypeptide of SEQ ID NO: 1.

4. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

5. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2 or SEQ ID NO: 3.

6. The nucleic acid of any one of claims 1-5, wherein said nucleic
acid is an expression vector capable of expressing said polypeptide from said
nucleotide sequence in a human cell.

7. A nucleic acid comprising a gene expression cassette able to
express a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to
SEQ ID NO: 1 in a human cell, provided that said polypeptide can process itself to
produce an NS5B protein and said NS5B protein is enzymatically inactive, said
expression cassette comprising:
a) a promoter transcriptionally coupled to a nucleotide sequence
encoding said polypeptide;
b) a 5' ribosome binding site functionally coupled to said nucleotide
sequence,
c) a terminator joined to the 3' end of said nucleotide sequence, and
d) a 3' polyadenylation signal functionally coupled to said nucleotide
sequence.

8. The nucleic acid of claim 7, wherein said nucleotide sequence
is substantially similar to either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

9. The nucleic acid of claim 8, wherein said nucleic acid is a
shuttle vector further comprising a selectable marker, an origin of replication, a first
adenovirus homology region and a second adenovirus homology region flanking said
expression cassette, wherein said first homology region has at least about 100 base
pairs substantially homologous to at least the right end of a wild-type adenovirus region
from about base pairs 1-425, and said second homology region has at least about 100
base pairs substantially homologous to at least the left end of a wild-type adenovirus
region from about base pairs 3511-5792 of Ad5 or corresponding region of another
adenovirus.

10. The nucleic acid of claim 9, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

11. The nucleic acid of claim 9, wherein said nucleotide sequence
is SEQ ID NO: 2.

12. The nucleic acid of claim 9, wherein said nucleotide sequence
is either SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

13. The nucleic acid of claim 8, wherein said nucleic acid is a
plasmid suitable for administration into a human and further comprises a prokaryotic
origin of replication and a gene coding for a selectable marker.

14. The nucleic acid of claim 13, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

15. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

16. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of SEQ ID NO: 2 or SEQ ID NO: 3.

17. The nucleic acid of claim 14, wherein said promoter is the
human intermediate early cytomegalovirus promoter (intron A), said 5' ribosome
binding site consists of SEQ ID NO: 12, and said 3' polyadenylation is the bovine
growth hormone (BGH) polyadenylation signal.

18. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and a recombinant adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

19. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and
a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;
b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;
c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;
d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;
e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and
f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

20. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

21. The nucleic acid of claim 20, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

22. The nucleic acid of claim 21, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

23. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

24. The nucleic acid of claim 23, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

25. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

26. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2 or SEQ ID NO: 3.

27. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising an origin of replication, a selectable marker,
and:
a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;
b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;
c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;
d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;
e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and
f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

28. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

29. The nucleic acid of claim 28, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

30. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

31. The nucleic acid of claim 30, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

32. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of a nucleotide sequence substantially similar to SEQ ID
NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

33. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector having an adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

34. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:
a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;
b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;
c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;
d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;
e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and
f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

35. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

36. The nucleic acid of claim 35, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

37. The nucleic acid of claim 36, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

38. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

39. The nucleic acid of claim 37, where said promoter is the human
intermediate early cytomegalovirus promoter, said 5' ribosome binding site consists
of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation signal.

40. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2,
SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

41. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2 or
SEQ ID NO: 3.

42. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:
a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;
b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;
c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;
d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;
e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and
f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

43. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

44. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

45. An adenovector consisting of the nucleic acid sequence of SEQ
ID NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

46. An adenovector produced by a process comprising the steps of:
a) producing an adenovirus genome plasmid by homologous
recombination between the shuttle vector of claim 9 and a nucleic acid comprising:
a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;
a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;
a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;
a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and
a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region; and
b) rescuing said adenovector from said adenovirus plasmid.

47. A cultured recombinant cell comprising the nucleic acid of
claim 6.

48. A cultured recombinant cell comprising the nucleic acid of any
one of claims 9-46.

49. A method of making an adenovector comprising the steps of:

a) producing an adenovirus genome plasmid comprising a gene
expression cassette by homologous recombination between the nucleic acid of claim 9
and a nucleic acid comprising:

a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region; and

b) rescuing said recombinant adenovirus from said recombinant
adenovirus plasmid.

50. A pharmaceutical composition comprising the nucleic acid of
any one of claims 13-17 and 32-46 and a pharmaceutically acceptable carrier.

51. A method of treating a patient comprising the step of
administering to said patient an effective amount of the nucleic acid of any one of
claims 13-17 and 32-46.

52. The method of claim 51, wherein said patient is a human.

53. The method of claim 52, wherein said patient is not infected
with HCV.

54. The method of claim 52, wherein said patient is infected with
HCV.

55. A recombinant nucleic acid comprising one or more Ad6
regions and a region not present in Ad6, wherein at least one Ad6 region is selected
from the group consisting of: E1A, E1B, E2B, E2A, E4, L1, L2, L4, and L5.

56. The recombinant nucleic acid of claim 55, wherein said region 
not present in Ad6, is an expression cassette coding for a polypeptide not found in 
Ad6. 

57. The recombinant nucleic acid of claim 56, wherein said
recombinant nucleic acid is an adenovirus vector defective in at least E1 that is able to
replicate when E1 is supplied in trans.

58. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said gene expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) an optionally present fourth region from about base pair 28134
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to
about 30789 corresponding to Ad6, joined to said third region;

f) a fifth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, wherein said fifth region is joined to said fourth
region if said fourth region is present, or said fifth is joined to said third region if said
fourth region is not present; and

g) a sixth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region;

provided that at least one of said second, third, and fifth regions is
from Ad6.

59. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region;

provided that at least one of said second, third, and fourth regions is
from Ad6.
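
The region layout recited in claim 59 can be written down as a small lookup table, which makes the proviso ("at least one of said second, third, and fourth regions is from Ad6") easy to check for a given Ad5/Ad6 combination. The following is only an illustrative sketch: the names REGIONS, LAYOUT, region_length, describe_vector and satisfies_proviso are hypothetical and do not appear in the application, and the coordinates are the approximate ("about") base-pair positions taken from the claim text.

```python
# Hypothetical illustration; names and structure are assumptions, not part of the claims.
# Coordinates are the approximate ("about") 1-based, inclusive base-pair ranges of claim 59.
REGIONS = {
    "first": {"Ad5": (1, 450), "Ad6": (1, 450)},
    "second": {"Ad5": (3511, 5548), "Ad6": (3508, 5541)},
    "third": {"Ad5": (5549, 28133), "Ad6": (5542, 28156)},
    "fourth": {"Ad5": (30818, 33966), "Ad6": (30789, 33784)},
    "fifth": {"Ad5": (33967, 35935), "Ad6": (33785, 35759)},
}

# Order of elements in claim 59: the expression cassette sits in the E3
# position, between the third and fourth adenovirus regions.
LAYOUT = ("first", "second", "third", "cassette", "fourth", "fifth")


def region_length(name, serotype):
    """Nominal length in base pairs of one region for the given serotype."""
    start, end = REGIONS[name][serotype]
    return end - start + 1


def describe_vector(choice):
    """Return the ordered layout as (name, serotype, start, end) tuples.

    `choice` maps each region name to "Ad5" or "Ad6".
    """
    rows = []
    for name in LAYOUT:
        if name == "cassette":
            rows.append(("cassette", None, None, None))
            continue
        serotype = choice[name]
        start, end = REGIONS[name][serotype]
        rows.append((name, serotype, start, end))
    return rows


def satisfies_proviso(choice):
    """True if at least one of the second, third and fourth regions is Ad6."""
    return any(choice[name] == "Ad6" for name in ("second", "third", "fourth"))
```

For example, a choice with every region set to "Ad5" fails the proviso, while switching only the third region to "Ad6" satisfies it. The same table could be adapted to claim 58 by moving the cassette to the E1 position and adding the optional E3 region.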




1 MAPITAYSQQ TRGLLGCIIT SLTGRDKNQV EGEVQWSTA TQSFLATCVN

51 GVCWTVYHGA GSKTLAGPKG PITQMYTNVD QDLVGWQAPP GARSLTPCTC

101 GSSDLYLVTR HADVIPVRRR GDSRGSLLSP RFVSYLKGSS GGPLLCPSGH

151 AVGIFRAAVC TRGVAKAVDF VPVESMETTM RSPVFTDNSS PPAVPQSFQV

201 AHLHAPTGSG KSTKVPAAYA AQGYKVLVLN PSVAATLGFG AYMSKAHGID

251 PNIRTGVRTI TTGAPVTYST YGKFLADGGC SGGAYDIIIC DEQHSTDSTT

301 ILGIGTVLDQ AETAGARLW LATATPPGSV TVPHPNIEEV ALSNTGEIPF

351 YGKAIPIEAI RGGRHLIFCH SKKKCDELAA KLSGLGINAV AYYRGLDVSV

401 IPTIGDWW ATDALMTGYT GDFDSVIDCN TCVTQTVDFS LDPTFTIETT

451 TVPQDAVSRS QRRGRTGRGR RGIYRFVTPG ERPSGMFDSS VLCECYDAGC 

501 AWYELTPAET SVRLRAYLNT PGLPVCQDHI EFWESVFTGL THIDAHFLSQ

551 TKQAGDNFPY LVAYQATVCA RAQAPPPSWD QMWKCLIRLK PTLHGPTPLL

601 YRLGAVQNEV TLTHPITKYI MACMSADLEV VTSTWVLVGG VLAALAAYCL

651 TTGSWIVGR IILSGRPAIV PDREFLYQEF DEMEECASHL PYIEQGMQLA

701 EQFKQKALGL LQTATKQAEA AAPWESKWR ALETFWAKHM WNFISGIQYL

751 AGLSTLPGNP AIASLMAFTA SITSPLTTQS TLLFNILGGW VAAQLAPPSA

801 ASAFVGAGIA GAAVGSIGLG KVLVDILAGY GAGVAGALVA FKVMSGEMPS

851 TEDLVNLLPA ILSPGALWG WCAAILRRH VGPGEGAVQW MNRLIAFASR 

901 GNHVSPTHYV PESDAAARVT QILSSLTITQ LLKRLHQWIN EDCSTPCSGS 

951 WLRDVWDWIC TVLTDFKTWL QSKLLPQLPG VPFFSCQRGY KGVWRGDGIM

1001 QTTCPCGAQI TGHVKNGSMR IVGPKTCSNT WHGTFPINAY TTGPCTPSPA

1051 PNYSRALWRV AAEEYVEVTR VGDFHYVTGM TTDNVKCPCQ VPAPEFFTEV

1101 DGVRLHRYAP ACRPLLREEV TFQVGLNQYL VGSQLPCEPE PDVAVLTSML 

1151 TDPSHITAET AKRRLARGSP PSLASSSASQ LSAPSLKATC TTHHVSFDAD

1201 LIEANLLWRQ EMGGNITRVE SENKWVLDS FDPLRAEEDE REVSVPAEIL 

1251 RKSKKFPAAM PIWARPDYNP PLLESWKDPD YVPPWHGCP LPPIKAPPIP 

1301 PPRRKRTWL TESSVSSALA ELATKTFGSS ESSAVDSGTA TALPDQASDD 

1351 GDKGSDVESY SSMPPLEGEP GDPDLSDGSW STVSEEASED WCCSMSYTW

1401 TGALITPCAA EESKLPINAL SNSLLRHHNM VYATTSRSAG LRQKKVTFDR 

1451 LQVLDDHYRD VLKEMKAKAS TVKAKLLSVE EACKLTPFHS AKSKFGYGAK 

1501 DVRNLSSKAV NHIHSVWKDL LEDTVTPIDT TIMAKNEVFC VQPEKGGRKP

1551 ARLIVFPDLG VRVCEKMALY DWSTLPQW MGSSYGFQYS PGQRVEFLVN

1601 TWKSKKNPMG FSYDTRCFDS TVTENDIRVE ESIYQCCDLA PEARQAIKSL

1651 TERLYIGGPL TNSKGQNCGY RRCRASGVLT TSCGNTLTCY LKASAACRAA

1701 KLQDCTMLVN AAGLWICES AGTQEDAASL RVFTEAMTRY SAPPGDPPQP

1751 EYDLELITSC SSNVSVAHDA SGKRVYYLTR DPTTPLARAA WETARHTPVN

1801 SWLGNIIMYA PTLWARMILM THFFSILLAQ EQLEKALDCQ IYGACYSIEP

1851 LDLPQIIERL HGLSAPSLHS YSPGEINRVA SCLRKLGVPP LRVWRHRARS 

1901 VRARLLSQGG RAATCGKYLF NWAVKTKLKL TPIPAASQLD LSGWFVAGYS 

1951 GGDIYHSLSR ARPRWFMLCL LLLSVGVGIY LLPNR 
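
The sequence above is laid out in the usual figure format: a position label followed by blocks of ten residues. A minimal sketch of how such lines might be joined back into a single string, and screened for characters outside the expected alphabet (a quick way to spot scanning damage), is given below. The function name parse_numbered_sequence and the constant AMINO_ACIDS are hypothetical helpers introduced only for this illustration.

```python
# Hypothetical helper for illustration; not part of the application.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def parse_numbered_sequence(lines, alphabet=AMINO_ACIDS):
    """Join grouped sequence lines into one string.

    Purely numeric tokens are treated as position labels and dropped.
    Returns the joined sequence and a sorted list of characters that are
    not in `alphabet`.
    """
    blocks = []
    for line in lines:
        for token in line.split():
            if token.isdigit():
                continue  # position label such as "1951"
            blocks.append(token)
    sequence = "".join(blocks)
    unexpected = sorted(set(sequence) - set(alphabet))
    return sequence, unexpected


last_line = "1951 GGDIYHSLSR ARPRWFMLCL LLLSVGVGIY LLPNR"
sequence, unexpected = parse_numbered_sequence([last_line])
```

Applied to the final line of the figure, this yields 35 residues with no unexpected characters, so the last residue sits at position 1951 + 35 - 1 = 1985. Passing a nucleotide alphabet such as "ACGT" as the second argument would allow the same check on the DNA figures.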
CGCCCATCAC GGCCTACTCC
ATCACTAGCC TTACAGGCCG 
GGTGGTTTCC ACCGCAACAC 
TGTGTTGGAC CGTTTACCAT 
AAGGGGGCAA TCACCCAGAT 
CTGGCAGGCG CCCCCCGGGG 
GCTCAGACCT TTACTTGGTC 
CGGCGGGGCG ACAGf AGGGG 
CTTGAAGGGC TCTTCGGGTG 
TGGGCATCTT CCGGGCTGCC 
GACTTTGTGC CCGTAGAGTC 
CACGGACAAC TCATCCCCCC 
ACCTACACGG TCCCACTGGC 
TATGCAQCCC AAGGGTACAA 
TACCTTAGGG TTTGGGGCGT 
ACATCAGAAC TGGGGTAAGG 
TCTACCTATG GCAAGTTTCT 
TGACATCATA ATATGTGATG 
TGGGCATCGG CACAGTCCTG 
GTCGTGCTCG CCACCGCTAC 
AAACATCGAG GAGGTGGCCC 
GCAAAGCCAT CCCCATTGAA 
TGTCATTCCA AGAAGAAGTG 
CGGAATCAAC GCTGTGGCGT 
CAACTATCGG AGACGTCGTT 
TATACGGGCG ACTTTGACTC 
GACAGTCGAC TTCAGCTTGG 
TGCCTCAAGA CGCAGTGTCG 
GGTAGGAGAG GCATCTACAG 
CATGTTCGAT TCCTCGGTCG 
GGTACGAGCT CACCCCCGCC 
AACACACCAG GGTTGCCCGT, 
TGTCTTCAC A GGCCTCACCC 
AGCAGGCAGG AGACAACTTC 






CAACAGACGC GGGGCCTACT . 
GGACAAGAAC CAGGTCGAGG 
AATCCTTCCT GGCGACCTGC 
GGTGCTGGCT CAAAGACCTT 
GTACACTAAT GTGGACCAGG * 
CGCGTTCCTT GACACCATGC 
ACGAGACATG CTGACGTCAT 
GAGCCTGCTC TCCCCCAGGG 
GTCCACTGCT CTGCCCTTCG 
GTATGCACCC GGGGGGTTGC 
CATGGAAACT ACTATGCGGT 
CGGCCGTACC GCAGTCATTTV . 
AGCGGCAAGA GTACTAAAGT 
GGTGCTCGTC CTCAATCCGT 
ATATGTCTAA GGCACACGGT 
ACCATTACCA CAGGGGCCCC 
TGCCGATGGT GGTTGCTCTG 
AGTGCCATTC AACTGACTCG 
GACCAAGCGG AGACGGCTGG 
GCCTCCGGGA TCGGTCACCG 
TGTCTAATAC TGGAGAGATC 
GCCATCAGGG GGGGAAGGCA 
CGACGAGCTC GCCGCAAAGC 
ATTACCGGGG GCTCGATGTG ' 
GTCGTGGCAA CAGACGCTCT 
AGTGATCGAC TGTAAC ACAT 
ATCCCACCTT CACCATTGAG . 
CGCTCGCAGC GGCGGGGTAG 
GTTTGTGACT CCGGGAGAAC 
TGTGTGAGTG CTA'l'GACGCG 
GAGACCTCGG TTAGGTTGCG 
TTGCCAGGAC CACCTGGAGT 
ACATAGATGC ACACTTCTTiS 
CCCTACCTGG TAGCATACCA 
AGCCACGGTG
TGTGGAAGTG 
TTGCTGTACA 
CATAACCAAA 
CTAGCACCTG 
TGCCTGACAA 
GAGGCCGGCT 
AAATGGAAGA 
CTCGCCGAGC 
CAAACAAGCG 
TTGAGACATT 
TACTTAGCAG 
GATGGCATTC 
TCCTGTTTAA 
AGCGCCGCTT 
CAGCATAGGC 
CAGGAGTGGC 
CCCTCCACCG 
CGCCCTGGTC 
GTCCGGGAGA 
TCGCGGGGTA 
CGCAGCGCGT 
TGAAAAGGCT 
GGCTCGTGGC 
CTTCAAGACC 
CTTTTTTCTC 
ATCATGCAAA 
AAACGGTTCC 
ATGGAACATT 
CCAGCGCCAA 
CGTGGAGGTC 
CTGACAACGT 
GAGGTGGACG 
CCTACGGGAG 



TGCGCCAGGG 
TCTCATACGG 
GGCTGGGAGC 
TACATCATGG 
GGTGCTGGTG 
CAGGCAGTGT 
ATTGTTCCCG 
GTGCGCCTCG 
AATTCAAGCA 
GAGGCTGCTG 
CTGGGCGAAG 
GCTTATCCAC 
ACAGCCTCTA 
CATCTTGGGG 
CGGCTTTCGT 
CTTGGGAAGG 
CGGCGCGCTC 
AGGACCTGGT 
GTCGGGGTCG 
GGGGGCTGTG 
ATCATGTTTC 
GTTACTCAGA 
CCACCAGTGG 
TAAGGGATGT 
TGGCTCCAGT 
GTGCCAACGC 
CCACCTGCCC 
ATGAGGATCG 
CCCCATCAAC 
ACTATTCTAG 
ACGCGGGTGG 
AAAGTGCCCA 
GAGTGCGGTT 
GAGGTTACAT 



CTCAGGCCCC 
CTGAAACCTA 
CGTCCJAAAT 
CATGCATGTC 
GGCGGAGTCC 
GGTCATTGTG 
ACAGGGAGTT 
CACCTCCCTT 
GAAAGCGCTC 
CTCCCGTGGT 
CACATGTGGA 
TCTGCCTGGG 
TCACCAGCCC 
GGGTGGGTGG 
GGGCGCCGGC 
TGCTTGTGGA 
GTGGCCTTCA 
CAATCTACTT 
TGTGTGCAGC 
CAGTGGATGA 
CCCCACGCAC 
TCCTCTCCAG 
ATTAATGAAG 
TTGGGACTGG 
CCAAGCTCCT 
GGGTACAAGG 
ATGTGGAGCA 
TCGGGCCTAA 
GCATACACCA 
GGCGCTGTGG 
GGGATTTCCA 
TGCCAGGTTC 
GCACAGGTAC 
TCCAGGTCGG 



ACCTCCATCA 
CGCTGCACGG 
GAGGTCACCC 
GGCTGACCTG 
TTGCAGCTCT 
GGTAGGATTA 
TCTCTACCAG 
ACATCGAGCA 
GGGTTACTGC 
GGAGTCCAAG 
ATTTCATCAG 
AACCCCGCAA 
GCTCACCACC 
CTGCCCAACT 
ATCGCCGGTG 
CATTCTGGCG 
AGGTCATGAG 
CCTGCCATCC 
AATACTGCGT 
ACCGGCTGAT 
TATGTGCCTG 
CCTTACCATC 
ACTGCTCCAC 
ATATGCACGG 
GCCGCAGCTA 
GAGTCTGGCG 
CAGATCACCG 
GACCTGCAGC 
CGGGCCCCTG 
CGGGTGGCCG 
CTACGTGACG 
CGGCTCCTGA 
GCTCCGGCGT 
GCTCAACCAA 



TGGGATCAAA 
GCCAACACCC 
TCACCCACCC 
GAGGTCGTCA 
GGCCGCGTAT 
TCTTGTCCGG 
GAGTTCGATG 
GGGAATGCAG 
AAACAGCCAC 
TGGCGAGCCC 
CGGGATACAG 
TAGCATCATT 
CAAAGTACCC 
CGCCCCCCCC 
CGGCTGTTGG 
GGTTATGGAG 
CGGCGAGATG 
TCTCTCCTGG 
CGACACGTGG 
AGCGTTCGCC 
AGAGCGACGC 
ACTCAGCTGC 
ACCGTGTTCC. 
TGTTGACTGA 
CCGGGAGTCC 
GGGAGACGGC 
GACATGTCAA 
AACACGTGGC 
CACACCCTCT 
CTGAGGAGTA 
GGCATGACCA 
ATTCTTCACG 
GCAGGCCTCT 
TACCTGGTTG 
CCCGAACCGG ATGTAGCAGT GCTCACTTCG
CATCACAGCA GAAACGGCTA AGCGTAGGTT 
CCTTGGCCAG CTCTTCAGCT AGCCAGTTGT 
ACATGCACTA CCCACCATGT CTCTCCGGAC 
CCTCCTGTGG CGGCAGGAGA TGGGCGGGAA 
AGAACAAGGT GGTAGTCCTG GACTCTTTCG 1 
GATGAGAGGG AAGTATCGGT TCCGGCGGAG 
GTTCCCCGCA GCGATGCCCA TCTGGGCGCG 
TGTTAGAGTC. GTGGAAGGAC CCGGACTACG 
TGCCCGTTGC CACCTATCAA GGCCCCTCCA 
GAGGACGGTT GTCCTAACAG . AGTCCTCCGT ; 
TCGCTACTAA GACCTTCGGC AGCTCCGAAT 
ACGGCGACCG CCCTTCCTGA CCAGGCCTCC 
CGACGTTGAG TCGTACTCCT CCATGCCCCC . 
ACGCCGATCT CAGTGACGGG TCTTGGTCTA 
GAGGATGTCG TCTGCTGCTC AATGTCCTAC 
CACGCCATGC GCTGCGGAGG AAAGCAAGCT 
ACTCTTTGCT GCGCCACCAT . AACATGGTTT 
GGAGGCCTGC GGCAGAAGAA GGTCACCTTT 
CGACCACTAC CGGGACGTGC TCAAGGAGAT 
TTAAGGCTAA ACTCCTATCC GTAGAGGAAG 
CATTCGGCCA AATCCAAGTT TGGCTATGGG 
ATCCAGCAAG GCCGTTAACC ACATCCACTC 
AAGACACTGT GACACCAATT GACACCACCA 
TTCTGTGTCC AACCAGAGAA AGGAGGCCGT 
ATTCCCAGAT CTGGGAGTCC GTGTATGCGA 
TGGTCTCCAC CCTTCCTCAG GTCGTGATGG 
TACTCTCCTG GGCAGCGAGT CGAGTTCCTG 
GAAAAACCCC ATGGGCTTTT CATATGACAC 
TCACCGAGAA CGACATCCGT G^TGAGGAGT 
TTGGCCCCCG AAGCCAGACA . GGCCATAAAA 
TATCGGGGGT CCTCTGACTA ATTCAAAAGG 
GGTGCCGCGC GAGCGGCGTG CTGACGACTA 
TGTTACTTGA AGGCCTCTGC AGCCTGTCGA 
CCTGCGAAGC TCCAGGACTG CACGATGCTC GTGAACGCCG CCGGCCTTGT
CGTTATCTGT GAAAGCGCGG GAACCCAAGA GGACGCGGCG AGCCTACGAG 
TCTTCACGGA GGCTATGACT AGGTACTCTG CCCCCCCCGG ^CCGCCC 
CAACCAGAAT ACGACTTGGA GCTGATAACA TCATGTTCCT CCAA^ 
GGTCGCCCAC GATGCATCAG GCAAAAGGGT GTACTACCTC ACCCGTGATC 
CCACCACCCC CCTCGCACGG GCTGCGTGGG AAACAGCTAG ACACACTCCA 

gSaactcct ggctaggcaa cattatcatg tatgcgccca ctttgtgggc 

AAGGATGATT CTGATGACTC ACTTCTTCTC CATCCTTCTA GCACAGGAGC 
AACTTGAAAA AGCCC^GAC TGCCAGATCT ACGGGGCCTG TTACTCCATT 
GAGCCACTTG ACCTACCTCA GATCATTGAA CGACTCCATG GCCTTAGCGC 
ATTTTCACTC CATAGTTACT CTCCAGGTGA GATCAATAGG GTGGCTTCAT 
GCCTCAGGAA ACTTGGGGTA CCACCCTIGC GAGTCTGGAG ACATCGGGCC 
AGGAGCGTCC GCGCTAGGCT ACTGTCCCAG GGGGGGAGGG CCGCCACrTG 

"™ac ctc^aact gggcag^aa gaccaaactc aaactcac^c 

CAATCCCGGC TGCGTCCCAG CTGGACTTGT CCGGCTGGTT CGTTGCTGGT 
TACAGCGGGG GAGACATATA TCACAGCCTG TCTCGTGCCC GACCCCGCIG 
GTTCATGCTG TGCCTACTCC TACTTTCTGT AGGGGTAGGC ATCTACCTGC 



5951 TCCCCAACCG ATAAA 
1 GCCACCATGG CCCCCATCAC CGCCTACAGC CAGCAGACCC GCGGCCTGCT

51 GGGCTGCATC ATCACCAGCC TGACCGGCCG CGACAAGAAC CAGGTGGAGG 

101 GCGAGGTGCA GGTGGTGAGC ACCGCCACCC AGAGCTTCCT GGCCACCTGC 

151 GTGAACGGCG TGTGCTGGAC CGTGTACCAC GGCGCCGGCA GCAAGACCCT

201 GGCCGGCCCC AAGGGCCCCA TCACCCAGAT GTACACCAAC GTGGACCAGG 

251 ACCTGGTGGG CTGGCAGGCC CCCCCCGGCG CCCGCAGCCT GACCCCCTGC 

301 ACCTGCGGCA GCAGCGACCT GTACCTGGTG ACCCGCCACG CCGACGTGAT 

351 CCCCGTGCGC CGCCGCGGCG ACAGCCGCGG CAGCCTGCTG AGCCCCCGCC 

401 CCGTGAGCTA CCTGAAGGGC AGCAGCGGCG GCCCCCTGCT GTGCCCCAGC 

451 GGCCACGCCG TGGGCATCTT CCGCGCCGCC GTGTGCACCC GCGGCGTGGC 

501 CAAGGCCGTG GACTTCGTGC CCGTGGAGAG CATGGAGACC ACCATGCGCA 

551 GCCCCGTGTT CACCGACAAC AGCAGCCCCC CCGGCGTGCC CCAGAGCTTC 

601 CAGGTGGCCC ACCTGCACGC CCCCACCGGC AGCGGCAAGA GCACCAAGGT 

651 GCCCGCCGCC TACGCCGCCC AGGGCTACAA GGTGCTGGTG CtGAACCCCA- 

701 GCGTGGCCGC CACCCTGGGC TTCGGCGCCT ACATGAGCAA GGCCCACGGC 

751 ATCGACCCCA ACATCC6CAG CGGCGTGCGC ACCATCACCA CCGGCGCGCC 

801 CGTGACCTAC AGCACCTACG GCAAGTTCCT GGCCGACGGC GGCTGCAGCG 

851 GCGGCGCCTA CGACATCATC ATCTGCGACG AGTGCCACAG CACCGACAGC

901 ACCACCATCC TGGGGATCGG CACCGTGCTG GACCAGGCCG AGACCGCCGG

951 CGCCCGCCTG GTGGTGCTGG CCACCGCCAC CCCCCCCGGC AGCGTGACCG

1001 TGCCCCACCC CAACATCGAG GAGGTGGCCC TGAGCAACAC CGGCGAGATC

1051 CCCTTCTACG GCAAGGCCAT CCCCATCGAG GCCATCCGCG GCGGCCGCCA 

1101 CCTGATCTTC TGCCACAGCA AGAAGAAGTG CGACGAGCTG GCCGCCAAGC 

1151 TGAGCGGCCT GGGCATCAAC GCCGTGGCCT ACTACCGCGG CCTGGACGTG 

1201 AGCGTGATCC CCACCATCGG CGACGTGGTG GTGGTGGCCA CCGACGCCCT 

1251 GATGACCGGC TACACCGGCG ACTTCGACAG CGTGATCGAC TGCAACACCT

1301 GCGTGACCCA GACCGTGGAC TTCAGCCTGG ACCCCACCTT CACCATCGAG 

1351 ACCACCACCG TGCCCCAGGA CGCCGTGAGC CGCAGCCAGC GCCGCGGCCG 

1401 GACCGGCCGC GGCCGCCGCG GCATCTACCG CTTCGTGACC CCCGGCGAGC 

1451 GCCCCAGCGG CATGTTCGAC AGCAGCGTGC TGTGCGAGTG CTACGACGCC
1501 GGCTGCGCCT GGTACGAGCT GACCCCCGCC GAGACCAGCG TGCGCCTGCG

1551 CGCCTACCTG AACACCCCCG GCCTGCCCGT GTGCCAGGAC CACCTGGAGT
1601 TCTGGGAGAG CGTGTTCACC GGCCTGACCC ACATCGACGC CCACTTCCTG
1651 AGCCAGACCA AGCAGGCCC5G CGACAACTTC CCCTACCTGG TGGCGTACdA
1701 GGCCACCGTG TGCGCCCGCG CCCAGGCCCC CCCCCCCAGC TGGGACCAGA
" ~CTG CCTGATCCGC CTGAAGCCCA CCCTGCACGG CCCC^CCCCC 
"oi ctgctgtacc gcctgggcgc CGTGCAGAAC gaggtgaccc tgacccaccc 
'.si S^CCAAG TACATCATGG CCTGCATGAG CGCCGACCTO GAGGTGGTGA 
"oi CCAGCACCTG GGTGCTGGTG GCCGCCGTGC tggccgccct ggccgcctac 
i s ^CTCACCA CCCGCAGCGT GGTGATCCTC GGCCCCATCA TCCTSAGC^ 
2001 CCGCCCCGCC ATCGTGCCCG ACCGCGAGTT CCTGTACCAG ^^^^J^^ 
2 0 5 i AGATGGAGGA GTCCGCCAGC CACCTCCCCT ACATCGAGCA GGGCATGCAG 

01 c^cogagc agttcaagca gaaggccct* ggcctgctcc agaccgccag 

2151 CAAGGAGGCG GAGGCGGCCG CCGCCGTGGT GGAGAGCAAG ^GC^ 
"„i TGGAGACCTT CTGGGGCAAG CACATGTGGA ACTTCATCAG «««»> 
,,S1 TACCTGGCCG GCCTCAGCAC CCTCCCCGGC AACCCCGCCA TCGCGAGCCT 

23 oi GATGGCCTTC accgccagga tcaccagccg cctoagcacc ga»«^cc 

" 51 TGCTGTTCAA CATCCTGGGC GGCTGGGTGG CCGCCCAGCT GGCCCCCCCC 

2 0! AGCGCCGCCA GCGCCPTCGT GGGCGCGGGC ATCGCCGGCG CO«™» 

,1S1 CAGCATCGGC CTGGGCAAGG TGCTGGTGGA CATCCTGGCC GGCTACGGCG 

25 oi CCGGCGTGGC CGGCGCCCTG G^GCCTTCA AGGTGATGAG CGGCGAGATG 

"si CCCAGCACCG AGGACCTGGT GAACCTGCTG CCCGCCATCC TGAGCCCCGG 

2 «oi CGCCCTGGTG GTGGGCGTGG TGTGCGGCGC CATCCTGGGC GGCCACGTGG 

'esi ScCGGCGA GGGCGCCGTG CAGTGGATGA ACCGCCTGAT CGCCTTCGGC 

,701 AGCCGCGGCA ACCACGTGAG CCCCACCCAC TACGTGCCCG AGAGGGACGC 

,7 5 i CGCCGCCCGC GTGACCCAGA TCCTGAGCAG CCTGACCATC ACCCAGCTCC 

a." T^GCCT GCACCAGTGG ATCAACGAGG AGTGCAGCAC CCCCTGCAGC 

■ 2851 Sc^GCTOGC TGCGCGACGT GTGGGACTGG ATCTGCACCG TGCTGACCGA 

■ 2 i c^Lacc TGGCTGCAGA gcaagctgct ggcccaggtg CCCG3CGTGC 

2 51 CCTTCTTCAG CTGGCAGCGC GGCTACAAGG GCGTGTGGCG "BOOK 
300! ATCATGCAGA GCACCTGCCC GTGCGGCGCC CAGATCACCG GCCACGTGAA 
so gILggcagc ATGCGCATCG T^GCCCCCAA GACCTGCAGC AACACCTGGC 

l a ACGGGACCTT CCCCATCAAC GCCTACACCA CCCGCCCCTG CACCCCCAGC 

3!" CCCGCCCCCA ACTACAGCCG CGCCCTGTGG CGCGTGGCCG CCGAGGAGTA 

32 „i CGTGGAGGTG ACCCGCGTGG GCGACTTCCA CTACGTGACC GGCATGACCA 

Si ™CGT GAAGTGCCCC TGCCAGCTGC CCGCCCCCGA GTTCTTCACC 

"oi GAGGTGGACG GCOtSCGCCT GCACCGCTAC GCCCCCGCCT GCCGCCCCCT 

33S1 GCTGCGCGAG GAGGTGACCT TCCAGGTCGG CCTGAACCAG TACCTGGTGG 
3401 GCAGCCAGCT GCCCTGCGAG CCCGAGCCCG ACGTGGCCGT GCTGACCAGC

3451 ATGCTGACCG ACCCCAGCCA CATCACCGCG GAGACCGCCA AGCGCCGCCT 

3501 GGCCCGCGGC AGCCCCCCCA GCCTGGCCAG CAGCAGCGCC AGCCAGCTGA 

3551 GCGCCCCCAG CCTGAAGGCC ACCTGCACCA CCCACCACGT GAGCGCCGAG 

3601 GCCGACCTGA TCGAGGCCAA CCTGCTGTGG CGCCAGGAGA TGGGCGGCAA

3651 CATCACCCGC GTGGAGAGCG AGAACAAGGT GGTGGTGCTG GACAGCTTCG

3701 ACCCCCTGCG CGCCGAGGAG GACGAGCGCG AGGTGAGCGT GCCCGCCGAG

3751 ATCCTGCGCA AGAGCAAGAA GTTCCCCGCC GCCATGCCCA TCTGGGCCCG

3801 CCCCGACTAG AACCCCCCCC TGCTGGAGAG CTGGAAGGAC CCCGACTACG

3851 TGCCCCCCGT GGTGCACGGC TGCCCCCTGC CCCCCATCAA GGCCCCCCCC 

3901 ATCCCCCCCC CCCGCCGCAA GCGCACCGTG GTGCTGACCG AGAGCAGCGT 

3951 GAGCAGCGCC CTGGCCGAGC TGGCCACCAA GACCTTCGGC AGCAGCGAGA

4001 GCAGCGCCGT GGACAGCGGC ACCGCCACCG CCCTGCCCGA CCAGGCCAGC
4051 GACGACGGCG ACAAGGGCAG CGACGTGGAG AGCTACAGCA GCATGGCCCG

4101 CCTGGAGGGC GAGCCCGGCG ACGCCGACCT GAGCGACGGC AGCTGGAGCA 

4151 CCGTGAGCGA GGAGGCCAGC GAGGACGTGG TGTGCTGCAG CATGAGCTAC 

4201 ACCTGGACCG GCGCCCTGAT CACCCCCTGC GCCGCCGAGG AGAGCAAGCT 

4251 GCCCATCAAC GCCCTGAGCA ACAGCCTGCT GCGCCACCAC AACATGGTGT 

4301 ACGCCACCAC CAGCCGCAGC GCCGGCCTGC GCCAGAAGAA GGTGACCTTC

4351 GACCGCCTGC AGGTGCTGGA CGACCACTAC CGCGACGTGC TGAAGGAGAT 

4401 GAAGGCCAAG GCCAGCACCG TGAAGGCCAA GCTGCTGAGC GTGGAGGAGG 

4451 CCTGCAAGCT GACCCCCCCC CACAGCGCCA AGAGCAAGTT CGGCTACGGC

4501 GCCAAGGACG TGCGCAACCT GAGCAGCAAG GCCGTGAACC ACATCCACAG 

4551 CGTGTGGAAG GACCTGCTGG AGGACACCGT GACCCCCATC GACACCACCA 

4601 TCATGGCCAA GAACGAGGTG TTCTGCGTGC AGCCCGAGAA GGGCGGCCGC

4651 AAGCCCGCCC GCCTGATCGT GTTCCCCGAC CTGGGCGTGC GCGTGTGCGA 

4701 GAAGATGGCC CTGTACGACG TGGTGAGCAC CCTGCCCCAG GTGGTGATGG 

4751 GCAGCAGCTA CGGCTTCCAG TACAGCCCCG GCCAGCGCGT CGAGTTCCTG

4801 GTGAACACCT GGAAGAGCAA GAAGAACCCC ATGGGCTTCA GCTACGACAC 

4851 CCGCTGCTTC GACAGCACCG TGACCGAGAA CGACATCCGC GTGGAGGAGA 

4901 GCATCTACCA GTGCTGCGAC CTGGCCCCCG AGGCCCGCCA GGCCATCAAG 

4951 AGCCTGACCG AGCGCCTGTA CATCGGCGGC CCCCTGACCA ACAGCAAGGG

5001 CCAGAACTGC GGCTACCGCC GCTGCCGCGC CAGCGGCGTG CTGACCACCA 

5051 GCTGCGGCAA CACCCTGACC TGCTACCTGA AGGCCAGCGC CGCCTGCCGC 
GCCGCCAAGC TGCAGGACTG CACCATGCTG GTGAACGCCG CCGGCCTGGT
GGTGATCTGC GAGAGCGCCG GCACCCAGGA GGACGCCGCC AGCCTGCGCG 
TGTTCACCGA GGCCATGACC CGCTACAGCG CCCCCCCCGG CGACCCCCCC 
CAGCCCGAGT ACGACCTGGA GCTGATCACC AGCTGCAGCA GCAACGTGAG 
CGTGGCCCAC GACGCCAGCG GCAAGCGCGT GTACTACCTG ACCCGCGACC 
CCACCACCCC CCTGGCCCGC GCCGCCTGGG AGACCGCCCG CCACACCCCC 
GTGAACAGCT GGCTGGGCAA CATCATCATG TACGCCCCCA CCCTGTGGGC 
CCGCATGATC CTGATGACCC ACTTCTTCAG CATCCTGCTG GCCCAGGAGC 
AGCTGGAGAA GGCCCTGGAC TGCCAGATCT ACGGCGCCTG CTACAGCATC 
GAGCCCCTGG ACCTGCCCCA GATCATCGAG CGCCTGCACG GCCTGAGCGC 
CTTCAGCCTG CACAGCTACA GCCCCGGCGA GATCAACCGC GTGGCCAGCT 
GCCTGCGCAA GCTGGGCGTG CCCCCCCTGC GCGTGTGGCG CCACCGCGCC 
CGCAGCGTGC GCGCCCGCCT GCTGAGCCAG GGCGGCCGCG CCGCCACCTG
CGGCAAGTAC CTGTTCAACT GGGCCGTGAA GACCAAGCTG AAGCTGACCC 
CCATCCCCGC CGCCAGCCAG CTGGACCTGA GCGGCTGGTT CGTGGCCGGC 
TACAGCGGCG GCGACATCTA CCACAGCCTG AGCCGCGCCC GCCCCCGCTG 
GTTCATGCTG TGCCTGCTGC TGCTGAGCGT GGGCGTGGGC ATCTACCTGC 



5951 TGCCCAACCG CTAAA 
1 catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt
61 ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 
121 gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 
181 gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 
241 taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 
301 agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 
361 gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 
421 cgggtcaaag ttggcgtttt attattatag gcggccgcga tccattgcat acgttgtatc 
481 catatcataa tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt 
541 gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 
601 tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 
661 cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 
721 attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 
781 atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 
841 atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 
901 tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 
961 actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 
1021 aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 
1081 gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg 
1141 cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc 
1201 tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt gagatctgcc 

1261 accatggcgc ccatcacggc ctactcccaa cagacgcggg gcctacttgg ttgcatcatc
1321 actagcctta caggccggga caagaaccag gtcgagggag aggttcaggt ggtttccacc 

1381 gcaacacaat ccttcctggc gacctgcgtc aacggcgtgt gttggaccgt ttaccatggt
1441 gctggctcaa agaccttagc cggcccaaag gggccaatca cccagatgta cactaatgtg 
1501 gaccaggacc tcgtcggctg gcaggcgccc cccggggcgc gttccttgac accatgcacc 
1561 tgtggcagct cagaccttta cttggtcacg agacatgctg acgtcattcc ggtgcgccgg 
1621 cggggcgaca gtagggggag cctgctctcc cccaggcctg tctcctactt gaagggctct 
1681 tcgggtggtc cactgctctg cccttcgggg cacgctgtgg gcatcttccg ggctgccgta
1741 tgcacccggg gggttgcgaa ggcggtggac tttgtgcccg tagagtccat ggaaactact 
1801 atgcggtctc cggtcttcac ggacaactca tcccccccgg ccgtaccgca gtcatttcaa 
1861 gtggcccacc tacacgctcc cactggcagc ggcaagagta ctaaagtgcc ggctgcatat 
1921 gcagcccaag ggtacaaggt gctcgtcctc aatccgtccg ttgccgctac cttagggttt 
1981 ggggcgtata tgtctaaggc acacggtatt gaccccaaca tcagaactgg ggtaaggacc 
2041 attaccacag gcgcccccgt cacatactct acctatggca agtttcttgc cgatggtggt 
2101 tgctctgggg gcgcttatga catcataata tgtgatgagt gccattcaac tgactcgact 
2161 acaatcttgg gcatcggcac agtcctggac caagcggaga cggctggagc gcggcttgtc 
2221 gtgctcgcca ccgctacgcc tccgggatcg gtcaccgtgc cacacccaaa catcgaggag 
2281 gtggccctgt ctaatactgg agagatcccc ttctatggca aagccatccc cattgaagcc 
2341 atcagggggg gaaggcatct cattttctgt cattccaaga agaagtgcga cgagctcgcc 
2401 gcaaagctgt caggcctcgg aatcaacgct gtggcgtatt accgggggct cgatgtgtcc 
2461 gtcataccaa ctatcggaga cgtcgttgtc gtggcaacag acgctctgat gacgggctat 
2521 acgggcgact ttgactcagt gatcgactgt aacacatgtg tcacccagac agtcgacttc 
2581 agcttggatc ccaccttcac cattgagacg acgaccgtgc ctcaagacgc agtgtcgcgc 
2641 tcgcagcggc ggggtaggac tggcaggggt aggagaggca tctacaggtt tgtgactccg 
2701 ggagaacggc cctcgggcat gttcgattcc tcggtcctgt gtgagtgcta tgacgcgggc 
2761 tgtgcttggt acgagctcac ccccgccgag acctcggtta ggttgcgggc ctacctgaac 
2821 acaccagggt tgcccgtttg ccaggaccac ctggagttct gggagagtgt cttcacaggc 
2881 ctcacccaca tagatgcaca cttcttgtcc cagaccaagc aggcaggaga caacttcccc 
2941 tacctggtag cataccaagc cacggtgtgc gccagggctc aggccccacc tccatcatgg 
3001 gatcaaatgt ggaagtgtct catacggctg aaacctacgc tgcacgggcc aacacccttg 
3061 ctgtacaggc tgggagccgt ccaaaatgag gtcaccctca cccaccccat aaccaaatac 
3121 atcatggcat gcatgtcggc tgacctggag gtcgtcacta gcacctgggt gctggtgggc 
3181 ggagtccttg cagctctggc cgcgtattgc ctgacaacag gcagtgtggt cattgtgggt 
3241 aggattatct tgtccgggag gccggctatt gttcccgaca gggagtttct ctaccaggag 
3361 gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag
3421 gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac
3481 atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac
3541 cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa
3601 agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc 
3661 gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 
3721 gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 
3781 gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct 
3841 gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 
3901 cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 
3961 cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt
4021 actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt 
4081 aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 
4141 tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg
4201 ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 
4261 atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 
4321 aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 
4381 tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg
4441 gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc
4501 atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag
4561 gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag
4621 gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc
4681 gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa
4741 acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 
4801 cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct 
4861 gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 
4921 gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 
4981 gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 
5041 atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 
5101 gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 
5161 ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta 
5221 gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg 
5281 gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg 
5341 tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 
5401 tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 
5461 tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 
5521 ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 
5581 ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 
5641 gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 
5701 gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 
5761 aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 
5821 ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 
5881 tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 
5941 ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 
6001 gtgatgggct cctcatacgg attccagtac tctcctgggc agcgagtcga gttcctggtg 
6061 aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 
6121 tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 
6181 gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 
6241 ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 
6301 acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 
6361 gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 
6421 agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 
6481 tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 
6541 tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 
6601 cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt 
6661 aactcctggc taggcaacat tatcatgtat
6721 atgactcact tcttctccat ccttctagca 
6781 cagatctacg gggcctgtta ctccattgag 
6841 ctccatggcc ttagcgcatt ttcactccat 
6901 gcttcatgcc tcaggaaact tggggtacca 
6961 agcgtccgcg ctaggctact gtcccagggg 
7021 ttcaactggg cagtgaagac caaactcaaa 
7081 gacttgtccg gctggttcgt tgctggttac 
7141 cgtgcccgac cccgctggtt catgctgtgc 
7201 tacctgctcc ccaaccggta aatctagagc 
7261 ttgcccctcc cccgtgcctt ccttgaccct 
7321 ataaaatgag gaaattgcat cgcattgtct 
7381 ggtggggcag gacagcaagg gggaggattg 
7441 ggtgggctct atggccgatc ggcgcgccgt 
7501 ggaaagaata tataaggtgg gggtcttatg 
7561 gccgccatga gcaccaactc gtttgatgga 
7621 atgcccccat gggccggggt gcgtcagaat 
7681 gtcctgcccg caaactctac taccttgacc 
7741 actgcagcct ccgccgccgc ttcagccgct 
7801 tttgctttcc tgagcccgct tgcaagcagt 
7861 aagttgacgg ctcttttggc acaattggat 
7921 cagcagctgt tggatctgcg ccagcaggtt 
7981 gcggtttaaa acataaataa aaaaccagac 
8041 tgctgtcttt atttaggggt tttgcgcgcg 
8101 ttgagggtcc tgtgtatttt ttccaggacg 
8161 atgggcataa gcccgtctct ggggtggagg 
8221 gtggtgttgt agatgatcca gtcgtagcag 
8281 ttcagtagca agctgattgc caggggcagg 
8341 agctgggatg ggtgcatacg tggggatatg 
8401 gctatgttcc cagccatatc cctccgggga 
8461 tatccggtgc acttgggaaa tttgtcatgt 
8521 gagacgccct tgtgacctcc aagattttcc 
8581 ccacgggcgg cggcctgggc gaagatattt 
8641 aggatgagat cgtcataggc catttttaca 
8701 ataatggttc catccggccc aggggcgtag 
8761 ttgagttcag atggggggat catgtctacc 
8821 gtaggggaga tcagctggga agaaagcagg 
8881 gtgggcccgt aaatcacacc tattaccggc 
8941 ccgtcatccc tgagcagggg ggccacttcg 
9001 ctgaccaaat ccgccagaag gcgctcgccg 
9061 aagtttttca acggtttgag accgtccgcc 
9121 agttccaggc ggtcccacag ctcggtcacc 
9181 cctcgtttcg cgggttgggg cggctttcgc 
9241 gggccagggt catgtctttc cacgggcgca 
9301 tgaaggggtg cgctccgggc tgcgcgctgg 
9361 tgctgaagcg ctgccggtct tcgccctgcg 
9421 catagtccag cccctccgcg gcgtggccct 
9481 cgcacgaggg gcagtgcaga cttttgaggg 
9541 ccggggagta ggcatccgcg ccgcaggccc 
9601 tgagctctgg ccgttcgggg tcaaaaacca 
9661 tacctctggc ttccatgagc cggtgtccac 
9721 cgtatacaga cttgagaggc ctgtcctcga 
9781 actcggacca ctctgagacg aaggctcgcg 
9841 aggggtagcg gtcgttgtcc actagggggt 
9901 cgccctcttc ggcatcaagg aaggtgattg 



gcgcccactt tgtgggcaag gatgattctg 
caggagcaac ttgaaaaagc cctggactgc 
ccacttgacc tacctcagat cattgaacga 
agttactctc caggtgagat caatagggtg 
cccttgcgag tctggagaca tcgggccagg 
gggagggccg ccacttgtgg caagtacctc 
ctcactccaa tcccggctgc gtcccagctg 
agcgggggag acatatatca cagcctgtct 
ctactcctac tttctgtagg ggtaggcatc 
tgtgccttct agttgccagc catctgttgt 
ggaaggtgcc actcccactg tcctttccta 
gagtaggtgt cattctattc tggggggtgg 
ggaagacaat agcaggcatg ctggggatgc 
actgaaatgt gtgggcgtgg cttaagggtg 
tagttttgta tctgttttgc agcagccgcc 
agcattgtga gctcatattt gacaacgcgc 
gtgatgggct ccagcattga tggtcgcccc 
tacgagaccg tgtctggaac gccgttggag 
gcagccaccg cccgcgggat tgtgactgac 
gcagcttccc gttcatccgc ccgcgatgac 
tctttgaccc gggaacttaa tgtcgtttct 
tctgccctga aggcttcctc ccctcccaat 
tctgtttgga tttggatcaa gcaagtgtct 
cggtaggccc gggaccagcg gtctcggtcg 
tggtaaaggt gactctggat gttcagatac 
tagcaccact gcagagcttc atgctgcggg 
gagcgctggg cgtggtgcct aaaaatgtct 
cccttggtgt aagtgtttac aaagcggtta 
agatgcatct tggactgtat ttttaggttg 
ttcatgttgt gcagaaccac cagcacagtg 
agcttagaag gaaatgcgtg gaagaacttg 
atgcattcgt ccataatgat ggcaatgggc 
ctgggatcac taacgtcata gttgtgttcc 
aagcgcgggc ggagggtgcc agactgcggt 
ttaccctcac agatttgcat ttcccacgct 
tgcggggcga tgaagaaaac ggtttccggg 
ttcctgagca gctgcgactt accgcagccg 
tgcaactggt agttaagaga gctgcagctg 
ttaagcatgt ccctgactcg catgttttcc 
cccagcgata gcagttcttg caaggaagca 
gtaggcatgc ttttgagcgt ttgaccaagc 
tgctctacgg catctcgatc cagcatatct 
tgtacggcag tagtcggtgc tcgtccagac 
gggtcctcgt cagcgtagtc tgggtcacgg 
ccagggtgcg cttgaggctg gtcctgctgg 
cgtcggccag gtagcatttg accatggtgt 
tggcgcgcag cttgcccttg gaggaggcgc 
cgtagagctt gggcgcgaga aataccgatt 
cgcagacggt ctcgcattcc acgagccagg 
ggtttccccc atgctttttg atgcgtttct 
gctcggtgac gaaaaggctg tccgtgtccc 
gcggtgttcc gcggtcctcc tcgtatagaa 
tccaggccag cacgaaggag gctaagtggg 
ccactcgctc cagggtgtga agacacatgt 
gtttataggt gtaggccacg tgaccgggtg 
9961 ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat
10021 cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 
10081 ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 
10141 tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 
10201 gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg 
10261 tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 
10321 gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 
10381 gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 
10441 gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta 

10501 gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt
10561 cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 
10621 gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 
10681 cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag 
10741 ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 
10801 cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 
10861 tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 
10921 ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 
10981 gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 

11041 acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt
11101 tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 
11161 actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 
11221 cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt
11281 actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 
11341 gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg
11401 cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 
11461 ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 
11521 gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 
11581 gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 
11641 aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaa^g 
11701 tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 
11761 cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 
11821 gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc 
11881 ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 
11941 agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 
12001 gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc 
12061 agtactggca gcggtgcacg ggctgtacat cctgcacgag gttgacctga cgaccgcgca 
12121 caaggaagca gagtgggaat ttgagcccct cgcctggcgg gtttggctgg tggtcttcta 
12181 cttcggctgc ttgtccttga ccgtctggct gctcgagggg agttacggtg gatcggacca 
12241 ccacgccgcg cgagcccaaa gtccagatgt ccgcgcgcgg cggtcggagc ttgatgacaa 
12301 catcgcgcag atgggagctg tccatggtct ggagctcccg cggcgtcagg tcaggcggga 
12361 gctcctgcag gtttacctcg catagccggg tcagggcgcg ggctaggtcc aggtgatacc 
12421 tgatttccag gggctggttg gtggcggcgt cgatggcttg caagaggccg catccccgcg 
12481 gcgcgactac ggtaccgcgc ggcgggcggt gggccgcggg ggtgtccttg gatgatgcat 
12541 ctaaaagcgg tgacgcgggc gggcccccgg aggtaggggg ggctcgggac ccgccgggag 
12601 agggggcagg ggcacgtcgg cgccgcgcgc gggcaggagc tggtgctgcg cgcggaggtt 
12661 gctggcgaac gcgacgacgc ggcggttgat ctcctgaatc tggcgcctct gcgtgaagac 
12721 gacgggcccg gtgagcttga acctgaaaga gagttcgaca gaatcaattt cggtgtcgtt 
12781 gacggcggcc tggcgcaaaa tctcctgcac gtctcctgag ttgtcttgat aggcgatctc 
12841 ggccatgaac tgctcgatct cttcctcctg gagatctccg cgtccggctc gctccacggt 
12901 ggcggcgagg tcgttggaga tgcgggccat gagctgcgag aaggcgttga ggcctccctc 
12961 gttccagacg cggctgtaga ccacgccccc ttcggcatcg cgggcgcgca tgaccacctg 
13021 cgcgagattg agctccacgt gccgggcgaa gacggcgtag tttcgcaggc gctgaaagag 
13081 gtagttgagg gtggtggcgg tgtgttctgc cacgaagaag tacataaccc agcgccgcaa 
13141 cgtggattcg ttgatatccc ccaaggcctc aaggcgctcc atggcctcgt agaagtccac 
13201 ggcgaagttg aaaaactggg agttgcgcgc cgacacggtt aactcctcct ccagaagacg 
13261 gatgagctcg gcgacagtgt cgcgcacctc gcgctcaaag gctacagggg cctcttcttc
13321 ttcttcaatc tcctcttcca taagggcctc cccttcttct tcttctggcg gcggtggggg 
13381 aggggggaca cggcggcgac gacggcgcac cgggaggcgg tcgacaaagc gctcgatcat 
13441 ctccccgcgg cgacggcgca tggtctcggt gacggcgcgg ccgttctcgc gggggcgcag 
13501 ttggaagacg ccgcccgtca tgtcccggtt atgggttggc ggggggctgc cgtgcggcag 
13561 ggatacggcg ctaacgatgc atctcaacaa ttgttgtgta ggtactccgc caccgaggga 
13621 cctgagcgag tccgcatcga ccggatcgga aaacctctcg agaaaggcgt ctaaccagtc
13681 acagtcgcaa ggtaggctga gcaccgtggc gggcggcagc gggcggcggt cggggttgtt 
13741 tctggcggag gtgctgctga tgatgtaatt aaagtaggcg gtcttgagac ggcggatggt 
13801 cgacagaagc accatgtcct tgggtccggc ctgctgaatg cgcaggcggt cggccatgcc 
13861 ccaggcttcg ttttgacatc ggcgcaggtc tttgtagtag tcttgcatga gcctttctac 
13921 cggcacttct tcttctcctt cctcttgtcc tgcatctctt gcatctatcg ctgcggcggc 
13981 ggcggagttt ggccgtaggt ggcgccctct tcctcccatg cgtgtgaccc cgaagcccct 
14041 catcggctga agcagggcca ggtcggcgac aacgcgctcg gctaatatgg cctgctgcac 
14101 ctgcgtgagg gtagactgga agtcgtccat gtccacaaag cggtggtatg cgcccgtgtt 
14161 gatggtgtaa gtgcagttgg ccataacgga ccagttaacg gtctggtgac ccggctgcga 
14221 gagctcggtg tacctgagac gcgagtaagc ccttgagtca aagacgtagt cgttgcaagt 
14281 ccgcaccagg tactggtatc ccaccaaaaa gtgcggcggc ggctggcggt agaggggcca 
14341 gcgtagggtg gccggggctc cgggggcgag gtcttccaac ataaggcgat gatatccgta 
14401 gatgtacctg gacatccagg tgatgccggc ggcggtggtg gaggcgcgcg gaaagtcacg 
14461 gacgcggttc cagatgttgc gcagcggcaa aaagtgctcc atggtcggga cgctctggcc 
14521 ggtcaggcgc gcgcagtcgt tgacgctcta gaccgtgcaa aaggagagcc tgtaagcggg 
14581 cactcttccg tggtctggtg gataaattcg caagggtatc atggcggacg accggggttc 
14641 gaaccccgga tccggccgtc cgccgtgatc catgcggtta ccgcccgcgt gtcgaaccca 
14701 ggtgtgcgac gtcagacaac gggggagcgc tccttttggc ttccttccag gcgcggcgga 
14761 tgctgcgcta gcttttttgg ccactggccg cgcgcggcgt aagcggttag gctggaaagc 
14821 gaaagcatta agtggctcgc tccctgtagc cggagggtta ttttccaagg gttgagtcgc 
14881 gggacccccg gttcgagtct cgggccggcc ggactgcggc gaacgggggt ttgcctcccc 
14941 gtcatgcaag accccgcttg caaattcctc cggaaacagg gacgagcccc ttttttgctt 
15001 ttcccagatg catccggtgc tgcggcagat gcgcccccct cctcagcagc ggcaagagca 
15061 agagcagcgg cagacatgca gggcaccctc cccttctcct accgcgtcag gaggggcaac 
15121 atccgcggct gacgcggcgg cagatggtga ttacgaaccc ccgcggcgcc ggacccggca 
15181 ctacttggac ttggaggagg gcgagggcct ggcgcggcta ggagcgccct ctcctgagcg 
15241 acacccaagg gtgcagctga agcgtgacac gcgcgaggcg tacgtgccgc ggcagaacct 
15301 gtttcgcgac cgcgagggag aggagcccga ggagatgcgg gatcgaaagt tccatgcagg 
15361 gcgcgagttg cggcatggcc tgaaccgcga gcggttgctg cgcgaggagg actttgagcc 
15421 cgacgcgcgg accgggatta gtcccgcgcg cgcacacgtg gcggccgccg acctggtaac 
15481 cgcgtacgag cagacggtga accaggagat taactttcaa aaaagcttta acaaccacgt 
15541 gcgcacgctt gtggcgcgcg aggaggtggc tataggactg atgcatctgt gggactttgt 
15601 aagcgcgctg gagcaaaacc caaatagcaa gccgctcatg gcgcagctgt tccttatagt 
15661 gcagcacagc agggacaacg aggcattcag ggatgcgctg ctaaacatag tagagcccga 
15721 gggccgctgg ctgctcgatt tgataaacat tctgcagagc atagtggtgc aggagcgcag 
15781 cttgagcctg gctgacaagg tggccgccat taactattcc atgctcagtc tgggcaagtt 
15841 ttacgcccgc aagatatacc atacccctta cgttcccata gacaaggagg taaagatcga 
15901 ggggttctac atgcgcatgg cgctgaaggt gcttaccttg agcgacgacc tgggcgttta 
15961 tcgcaacgag cgcatccaca aggccgtgag cgtgagccgg cggcgcgagc tcagcgaccg 
16021 cgagctgatg cacagcctgc aaagggccct ggctggcacg ggcagcggcg atagagaggc 
16081 cgagtcctac tttgacgcgg gcgctgacct gcgctgggcc ccaagccgac gcgccctgga 
16141 ggcagctggg gccggacctg ggctggcggt ggcacccgcg cgcgctggca acgtcggcgg 
16201 cgtggaggaa tatgacgagg acgatgagta cgagccagag gacggcgagt actaagcggt 
16261 gatgtttctg atcagatgat gcaagacgca acggacccgg cggtgcgggc ggcgctgcag 
16321 agccagccgt ccggccttaa ctccacggac gactggcgcc aggtcatgga ccgcatcatg 
16381 tcgctgactg cgcgcaaccc tgacgcgttc cggcagcagc cgcaggccaa ccggctctcc 
16441 gcaattctgg aagcggtggt cccggcgcgc gcaaacccca cgcacgagaa ggtgctggcg 
16501 atcgtaaacg cgctggccga aaacagggcc atccggcccg atgaggccgg cctggtctac 
16561 gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac
16621 cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc 
16681 aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg 
16741 cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca 
16801 ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc 
16861 ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg 
16921 gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg 
16981 ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt 
17041 cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc 
17101 caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca 
17161 accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac 
17221 agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc 
17281 gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg 
17341 tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc 
17401 gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt 
17461 ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata 
17521 gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag 
17581 gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc
17641 gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc 
17701 agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg 
17761 ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc 
17821 ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc 
17881 ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac 
17941 gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca 
18001 caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac 
18061 tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc 
18121 ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt 
18181 ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg 
18241 gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt 
18301 cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta 
18361 ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg 
18421 ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa 
18481 aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa 
18541 ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga 
18601 gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat 
18661 gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag 
18721 cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg 
18781 tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc 
18841 aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg 
18901 gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc 
18961 cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg 
19021 gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc 
19081 agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt 
19141 tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc 
19201 cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc 
19261 cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac 
19321 ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc 
19381 atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt 
19441 gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc 
19501 ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt 
19561 ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga 
19621 gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc 
19681 tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt 
19741 gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt 
19801 ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc 
23161 ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg
23221 aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc
23281 aggcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg
23341 gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg
23401 ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 

23461 aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca
23521 tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta
23581 tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 
23641 caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 

23701 taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg
23761 ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag
23821 acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc
23881 aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 
23941 agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 
24001 ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 
24061 caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 
24121 caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag
24181 aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc 
24241 tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 
24301 accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag

24361 tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 
24421 acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 
24481 tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 
24541 ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 
24601 atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 
24661 ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 
24721 ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 
24781 ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc 
24841 catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 
24901 ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 
24961 ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 
25021 ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgagtttgag attaagcgct 
25081 cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 
25141 tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca 
25201 aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 
25261 atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 
25321 tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct 
25381 acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 
25441 gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 
25501 tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg
25561 atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 
25621 tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 
25681 gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 
25741 tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac 
25801 ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 
25861 cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 
25921 aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta
25981 ccagtttgag tacgagtcac tcctgcgccg tagcgccatt gcctcttccc ccgaccgctg
26041 tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcct 
26101 attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 
26161 ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 
26221 gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgcccta
26281 cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 
26341 gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 
26401 tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 
26461 gccgcgcatc gctatgcgcc actggcaggg acacgttgcg atactggtgt ttagtgctcc
26521 acttaaactc aggcacaacc atccgcggca gctcggtgaa gttttcactc cacaggctgc 
26581 gcaccatcac caacgcgttt agcaggtcgg gcgccgatat cttgaagtcg cagttggggc 
26641 ctccgccctg cgcgcgcgag ttgcgataca cagggttaca gcactggaac actatcagcg 
26701 ccgggtggtg cacgctggcc agcacgctct tgtcggagat cagatccgcg tccaggtcct 
26761 ccgcgttgct cagggcgaac ggagtcaact ttggtagctg ccttcccaaa aagggtgcat 
26821 gcccaggctt tgagttgcac tcgcaccgta gtggcatcag aaggtgaccg tgcccagtct 
26881 gggcgttagg atacagcgcc tgcatgaaag ccttgatctg cttaaaagcc acctgagcct 
26941 ttgcgccttc agagaagaac atgccgcaag acttgccgga aaactgattg gccggacagg 
27001 ccgcgtcatg cacgcagcac cttgcgtcgg tgttggagat ctgcaccaca tttcggcccc 
27061 accggttctt cacgatcttg gccttgctag actgctcctt cagcgcgcgc tgcccgtttt 
27121 cgctcgtcac atccatttca atcacgtgct ccttatttat cataatgctc ccgtgtagac 
27181 acttaagctc gccttcgatc tcagcgcagc ggtgcagcca caacgcgcag cccgtgggct 
27241 cgtggtgctt gtaggttacc tctgcaaacg actgcaggta cgcctgcagg aatcgcccca 
27301 tcatcgtcac aaaggtcttg ttgctggtga aggtcagctg caacccgcgg tgctcctcgt 
27361 ttagccaggt cttgcatacg gccgccagag cttccacttg gtcaggcagt agcttgaagt 
27421 ttgcctttag atcgttatcc acgtggtact tgtccatcaa cgcgcgcgca gcctccatgc 
27481 ccttctccca cgcagacacg atcggcaggc tcagcgggtt tatcaccgtg ctttcacttt 
27541 ccgcttcact ggactcttcc ttttcctctt gcatccgcat accccgcgcc actgggtcgt 
27601 cttcattcag ccgccgcacc gtgcgcttac ctcccttgcc gtgcttgatt agcaccggtg 
27661 ggttgctgaa acccaccatt tgtagcgcca catcttctct ttcttcctcg ctgtccacga 
27721 tcacctctgg ggatggcggg cgctcgggct tgggagaggg gcgcttcttt ttctttttgg 
27781 acgcaatggc caaatccgcc gtcgaggtcg atggccgcgg gctgggtgtg cgcggcacca 
27841 gcgcatcttg tgacgagtct tcttcgtcct cggactcgag acgccgcctc agccgctttt 
27901 ttgggggcgc gcggggaggc ggcggcgacg gcgacgggga cgagacgtcc tccatggttg 
27961 gtggacgtcg cgccgcaccg cgtccgcgct cgggggtggt ttcgcgctgc tcctcttccc 
28021 gactggccat ttccttctcc tataggcaga aaaagatcat ggagtcagtc gagaaggagg 
28081 acagcctaac cgcccccttt gagttcgcca ccaccgcctc caccgatgcc gccaacgcgc 
28141 ctaccacctt ccccgtcgag gcacccccgc ttgaggagga ggaagtgatt atcgagcagg 
28201 acccaggttt tgtaagcgaa gacgacgaag atcgctcagt accaacagag gataaaaagc 
28261 aagaccagga cgacgcagag gcaaacgagg aacaagtcgg gcggggggac caaaggcatg 
28321 gcgactacct agatgtggga gacgacgtgc tgttgaagca tctgcagcgc cagtgcgcca 
28381 ttatctgcga cgcgttgcaa gagcgcagcg atgtgcccct cgccatagcg gatgtcagcc 
28441 ttgcctacga acgccacctg ttctcaccgc gcgtaccccc caaacgccaa gaaaacggca 
28501 catgcgagcc caacccgcgc ctcaacttct accccgtatt tgccgtgcca gaggtgcttg 
28561 ccacctatca catctttttc caaaactgca agatacccct atcctgccgt gccaaccgca 
28621 gccgagcgga caagcagctg gccttgcggc agggcgctgt catacctgat atcgcctcgc 
28681 tcgacgaagt gccaaaaatc tttgagggtc ttggacgcga cgagaagcgc gcggcaaacg 
28741 ctctgcaaca agaaaacagc gaaaatgaaa gtcactgtgg agtgctggtg gaacttgagg 
28801 gtgacaacgc gcgcctagcc gtgctgaaac gcagcatcga ggtcacccac tttgcctacc 
28861 cggcacttaa cctacccccc aaggttatga gcacagtcat gagcgagctg atcgtgcgcc 
28921 gtgcacgacc cctggagagg gatgcaaact tgcaagaaca aaccgaggag ggcctacccg 
28981 cagttggcga tgagcagctg gcgcgctggc ttgagacgcg cgagcctgcc gacttggagg 
29041 agcgacgcaa gctaatgatg gccgcagtgc ttgttaccgt ggagcttgag tgcatgcagc 
29101 ggttctttgc tgacccggag atgcagcgca agctagagga aacgttgcac tacacctttc 
29161 gccagggcta cgtgcgccag gcctgcaaaa tttccaacgt ggagctctgc aacctggtct 
29221 cctaccttgg aattttgcac gaaaaccgcc ttgggcaaaa cgtgcttcat tccacgctca 
29281 agggcgaggc gcgccgcgac tacgtccgcg actgcgttta cttatttctg tgctacacct 
29341 ggcaaacggc catgggcgtg tggcagcagt gcctggagga gcgcaacctg aaggagctgc 
29401 agaagctgct aaagcaaaac ttgaaggacc tatggacggc cttcaacgag cgctccgtgg 
29461 ccgcgcacct ggcggacatt atcttccccg aacgcctgct taaaaccctg caacagggtc 
29521 tgccagactt caccagtcaa agcatgttgc aaaactttag gaactttatc ctagagcgtt 
29581 caggaattct gcccgccacc tgctgtgcgc ttcctagcga ctttgtgccc attaagtacc 
29641 gtgaatgccc tccgccgctt tggggtcact gctaccttct gcagctagcc aactaccttg 
29701 cctaccactc cgacatcatg gaagacgtga gcggtgacgg cctactggag tgtcactgtc 
29761 gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa

29821 gtcaaattat cggtaccctt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc 

29881 cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 

29941 aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 

30001 agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca

30061 aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 

30121 gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg 

30181 cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag 

30241 gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa

30301 gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg 

30361 tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatt

30421 gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga

30481 tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag

30541 caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc 

30601 ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc

30661 gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 

30721 ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 

30781 tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct 

30841 ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatzttttc. ccactctgta 
30901 tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 

30961 gcgctccctc acccgcagcc gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 

31021 ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 
31081 cgccctttct caaatttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 

31141 agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 
31201 ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaataaacta 

31261 catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 

31321 aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 

31381 ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 

31441 agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggctttcg 

31501 tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 
31561 tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 

31621 gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ctctgcagac 

31681 ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt 

31741 gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 

31801 tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 

31861 ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 

31921 cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 

31981 cggcgtccgg ctcaccaccc aggtagagct tacacgtagc ctgattcggg agtttaccaa 

32041 gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 

32101 tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 

32161 attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 

32221 ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 

32281 aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 

32341 cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgacacg 

32401 gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc 

32461 caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 

32521 ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 

32581 tcaaatacaa tcactgtttc tcaaccgcta aaaaaaacaa agtccaatat aactttggaa 

32641 acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 

32701 gtggtctctg acaacactct taccatgcaa tcacaagcac cgctaaccgt gcaagactca 

32761 aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 

32821 acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 

32881 cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat 

32941 ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta

33001 acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca 
aaactccttc
gccaacctac 
ccatgttttt 
tgaacgcgct 
tttgtaagat 
taaaggctaa 
cccaaataat 
agtccggcca 
atcatgattg 
taacaaaaat 
ggtctgcacg 
tgattatgac 
gcatgggcgg 
aaaaagaaag 
ccacagaaaa 
aataaaataa 
ccttataagc 



aaactggaga 
taaataccac 
ggttggaatt 
gcatacaata 
acagctccgg 
caccagaccc 
cgctaacaaa 
atatggcctc 
acggagtgct 
actccactaa 
acccaaaaac 
gtgacaagtc 
aagtaagcaa 
acaaatttgc 
acctgttgca 
ttttcattca 
caaactcaca 
gtcctttctc 
ggtgttatat 
tccccgggca 
ccaacttgcg 
tcataatcgt 
cgccgccgct 
accgcccgca 
aagtcagcac 
gcgctgtatc 
cgcaggtaga 
ggcatgttgt 
tccaccacca 
ccgggactgg 
gtcatgatat 
agctcctccc 
cccacactgc 
tcgggcagca 
agacgatccc 
atgccaaatg 
acaaacagat 
tatccactct 
atgcgccgct 
acattcgttc 
ttttttattc 
cccctccggt 
gttgcacaat 
acccttcagg 
tctcatctcg 
ttgtaaaaat 
caaaaattca 
accgcgatcc 
gaccagcgcg 
acgcatactc 
cgatataaaa 
cacatcgtag 
agacaccatt 
caaaaaaaca 
ataagacgga 



tggcctctat 
aaaaggcctt 
tgaaacagac 
taataccaat 
agccataaca 
atccccaaat
atgtggcagt
catcaatgga 
tatgtcaaat 
cggtcaacca 
tcaaagtaaa 
taaaccattg 
atactcaata 
caccaattcc 
tgttatgttt 
gtagtatagc 
gaaccctagt 
cccggctggc 
tccacacggt 
gctcgcttaa 
gttgctcaac
gcatcaggat 
ccgtcctgca 
gcataaggcg 
agtaactgca 
caaagctcat 
ttaagtggcg 
aattcaccac 
tcctaaacca 
aacaatgaca 
caatgttggc 
gcgtcagaac 
agggaagacc 
gcggatgatc 
tactgtacgg 
gaacgccgga 
ctgcgtctcc 
ctcaaagcat 
gccctgataa 
tgcgagtcac 
caaaagatta 
ggcgtggtca 
ggcttccaaa 
gtgaatctcc 
ccaccttctc 
ctgctccaga 
ggttcctcac 
cgtaggtccc 
gccacttccc 
ggagctatgc 
tgcaaggtgc 
tcatgctcat 
tttctctcaa 
tttaaacatt 
ctacggccat 
36361 gccggcgtga ccgtaaaaaa actggtcacc
36421 ggtcatgtcc ggagtcataa tgtaagactc 
36481 cagtgctaaa aagcgaccga aatagcccgg 
36541 cattacagcc cccataggag gtataacaaa 
36601 tgaaaaaccc tcctgcctag gcaaaatagc 
36661 ttccacagcg gcagccataa cagtcagcct 
36721 acaccactcg acacggcacc agctcaatca 
36781 gcgagtatat ataggactaa aaaatgacgt 
36841 aaaccgcacg cgaacctacg cccagaaacg 
36901 cgtcacttcc gttttcccac gttacgtcac 
36961 cacatacaag ttactccgcc ctaaaaccta 
37021 cacgtcacaa actccacccc ctcattatca 
37081 attgatgatg 




gtgattaaaa agcaccaccg acagctcctc
ggtaaacaca tcaggttgat tcacatcggt
gggaatacat acccgcaggc gtagagacaa
attaatagga gagaaaaaca cataaacacc
accctcccgc tccagaacaa catacagcgc
taccagtaaa aaagaaaacc tattaaaaaa
gtcacagtgt aaaaaagggc caagtgcaga
aacggttaaa gtccacaaaa aacacccaga
aaagccaaaa aacccacaac ttcctcaaat
ttcccatttt aagaaaacta caattcccaa
cgtcacccgc cccgttccca cgccccgcgc
tattggcttc aatccaaaat aaggtatatt
10 30 50

ATGGCGCCCATCACGGCCTACTCCCAACAGACGCGGGGCCTACTTGGTTGCATCATCACT

+ + + + + +

MetAlaProIleThrAlaTyrSerGlnGlnThrArgGlyLeuLeuGlyCysIleIleThr

10 20 

70 90 110

AGCCTTACAGGCCGGGACAAGAACCAGGTCGAGGGAGAGGTTCAGGTGGTTTCCACCGCA

+ + + + + +

SerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGlnValValSerThrAla 

30 40 

130 150 170 

ACACAATCCTTCCTGGCGACCTGCGTCAACGGCGTGTGTTGGACCGTTTACCATGGTGCT

+ + + + + + 

ThrGlnSerPheLeuAlaThrCysValAsnGlyValCysTrpThrValTyrHisGlyAla 

50 60 

190 210 230 

GGCTCAAAGACCTTAGCCGGCCCAAAGGGGCCAATCACCCAGATGTACACTAATGTGGAC 

+ + + + + +

GlySerLysThrLeuAlaGlyProLysGlyProIleThrGlnMetTyrThrAsnValAsp 

70 80 

250 270 290 

CAGGACCTCGTCGGCTGGCAGGCGCCCCCCGGGGCGCGTTCCTTGACACCATGCACCTGT 

+ + + + + +

GlnAspLeuValGlyTrpGlnAlaProProGlyAlaArgSerLeuThrProCysThrCys

90 100 

310 330 350 

GGCAGCTCAGACCTTTACTTGGTCACGAGACATGCTGACGTCATTCCGGTGCGCCGGCGG 

+ + + + + +

GlySerSerAspLeuTyrLeuValThrArgHisAlaAspValIleProValArgArgArg

110 120

370 390 410 

GGCGACAGTAGGGGGAGCCTGCTCTCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCTTCG 

+ + + + + +

GlyAspSerArgGlySerLeuLeuSerProArgProValSerTyrLeuLysGlySerSer

130 140
430 450 470

GGTGGTCCACT<XTCTGCCCTTCGGGGCACG^ • 

GlyGlyProLeuLeuCysProSerGlyHisAlaValGlyIlePheArgAlaAlaValCys

150 160 

490 510 530 

ACCCGGGGGGTTGCGAAGGCGGTG<^CTTTCTG^ 

+ + + + + +

ThrArgGlyValAlaLysAlaValAspPheValProValGluSerMetGluThrThrMet

170 180 

550 570 590 

CGGTCTCCGGTCTTCACTCACAACTC^TC^^ 

+ + + + + +

ArgSerProValPheThrAspAsnSerSerProProAlaValProGlnSerPheGlnVal

190 200 

610 630 650 

GCCCACCTACACG<^CCACi 

+ + + + + +

AlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysValProAlaAlaTyrAla

210 220 

670 690 710 



AlaGlnGlyTyrLysValLeuValLeuAsnPro

230 240 

730 750 770 

GCGTATATGTCTAAGGCACACGGTATTCACCCCAACAT 

AlaTyrMetSerLysAlaHisGlyIleAspProAsnIleArgThrGlyValArgThrIle

250 260

790 810 830 

ACCACA^CGCCCCCGtCAGATAC 

ThrThrGlyAlaProValThrTyrSerThrTyr

270 280 
850 870 890

TCTGGGGGCGCTTATGACATCATAATATGTGATGAGTGCCATTCAACTGACTCGACTACA

+ + + + + +

SerGlyGlyAlaTyrAspIleIleIleCysAspGluCysHisSerThrAspSerThrThr

290 300 

910 930 950 

ATCTTGGGCATCGGCACAGTCCTGGACCAAGCGGAGACGGCTC 

+ + + + + +

IleLeuGlyIleGlyThrValLeuAspGlnAlaGluThrAlaGlyAlaArgLeuValVal

310 320

970 990 1010

CTCGCCACCGCTACGCCTCCGGGATCGGTCACCGTGCCACACCCAAACATCGAGGAGGTG 

+ + + + + +

LeuAlaThrAlaThrProProGlySerValThrValProHisProAsnIleGluGluVal

330 340 

1030 1050 1070 

GCCCTGTCTAATACTGGAGAGATCCCCTTC^ 

+ + + + + +

AlaLeuSerAsnThrGlyGluIleProPheTyrGlyLysAlaIleProIleGluAlaIle

350 360

1090 1110 1130

AGGGGGGGAAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTC 

+ + + + + +

ArgGlyGlyArgHisLeuIlePheCysHisSerLysLysLysCysAspGluLeuAlaAla

370 380

1150 1170 1190

AAGCTGTCAGGCCTCGGAATCAACGCTGTGGCGTATTACCGGGGGCTCGATGTGTCCGTC 

. + + + + + 

LysLeuSerGlyLeuGlyIleAsnAlaValAlaTyrTyrArgGlyLeuAspValSerVal

390 400 

1210 1230 1250 

ATACCAACTATCGGAGACGTCGTTGTCGTGGCAACAGACGCTCTGATGACGGGCTATACG 

+ _ + + + + 

IleProThrIleGlyAspValValValValAlaThrAspAlaLeuMetThrGlyTyrThr

410 420
1270 1290 1310

ox:gactttgactcagtgatc(^ctc^ 

+ + ± — + + + 

GlyAspPheAspSerValIleAspCysAsnThrCysValThrGlnThrValAspPheSer

430 440 

1330 1350 1370 

TTGGATCCCACCTTCACCATIXSAGAC 

' + + : + + 

LeuAspProThrPheThrIleGluThrThrThrValProGlnAspAlaValSerArgSer

450 460 

1390 1410 1430 

CAGCGGCGGGGTAGGACTGGCAGGGGTAGGAGAGGCATCTACAGGTTTC 

+ , + + + + + 

GlnArgArgGlyArgThrGlyArgGlyArgArgGlyIleTyrArgPheValThrProGly

470 480 

1450 1470 1490 

GAACGGCCCTCGGGCATGTTCGAiTCCTCGGTCCTGTG 

: + - + i +^_^_- ^ + : -.--J + -,i + 

GluArgProSerGlyMetPheAspSerSerValLeuCysGluCysTyrAspAlaGlyCys 

490 500 

1510 1530 1550 

GCTPTGOTAGGiAGCTCACCCGCGCCGAGACCTCGGTTAGGTTGCGGGGCTACCTGAACAGA 

— ^- — ^--i.* + — + + + 

AlaTrpTyrGluLeuThrProAlaGluThrSerValArgLeuArgAlaTyrLeuAsnThr

510 520 

1570 1590 1610 

CCAGGGTTGCCCGTTTGCGAGGACCACCTGGAGTTCTCGGA 

.+ — — - + + 

ProGlyLeuProValCysGlnAspHisLeuGluPheT^

530 540 

1630 1650 1670 

ACCGACATAGATGCACACTTCTTGTCCCAGA 

ThrHisIleAspAlaHisPheLeuSerGlnThrLysGlxiAlaGlyAspAsnPheProTyr

550 560 
1690 1710 1730

CTGGTAGCATACCAAGCCACGGTGTGCGCCAGGGCTCAGGCCCCACCTCCATCATGGGAT 



-+- 



.+ + + 



LeuValAlaTyrGlnAlaThrValCysAlaArgAlaGlnAlaProProProSerTrpAsp

570 580 

1750 1770 1790

CAAATCTGGAAGTGTCTCATACGGCTGAAACCTACGCTGCACGGGCCAACACCCTTGCTG 

+ + * + + + 

GlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGlyProThrProLeuLeu

590 600 

1810 1830 1850

TACAGGCTGGGAGCCGTCCAAAATGAGGTCACCCTCACCCACCCCATAACCAAATACATC 

.. + — - + + * — + + 

TyrArgLeuGlyAlaValGlnAsnGluValThrLeuThrHisProIleThrLysTyrIle

610 620 

1870 1890 1910 

ATGGCATGCATGTCGGCTGACCTGGAGGTCGTCACTAGCACCTGGGTGCTGGTGGGCGGA 

.+ + + * + + 



MetAlaCysMetSerAlaAspLeuGluValValThrSerThrTrpValLeuValGlyGly 

630 640

1930 1950 1970 

GTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACAGGCAGTGTGGTCATTGTGGGTAGG 

+ + + + ♦ + 

ValLeuAlaAlaLeuAlaAlaTyrCysLeuThrThrGlySerValValIleValGlyArg

650 660 

1990 2010 2030 

ATTATCTTGTCCGGGAGGCCGGCTATTGTTCCCGACAGGGAGTTTCTCTACCAGGAGTTC 

+ + + + ~* + 

IleIleLeuSerGlyArgProAlaIleValProAspArgGluPheLeuTyrGlnGluPhe

670 680

2050 2070 2090 

GATGAAATGGAAGAGTGCGCCTCGCACCTCCCTTACATCGAGCAGGGAATGCAGCTCGCC 

♦ + * ♦ * + 

AspGluMetGluGluCysAlaSerHisLeuProTyrIleGluGlnGlyMetGlnLeuAla

690 700 
2110 2130 2150

GAGCAATTCAAGGAGLAAAGCGCTC 

+ ^ + + + •+ + 

GluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaThrLysGlnAlaGluAla

710 720 

2170 2190 2210 

GCTGCTCCCGTGGTGGAGTCCAAGTGGCGAGCCCTTGAGACATTCTGGGCGAAGCACATC 

. +. + + + : + - + 

AlaAlaProValValGluSerLysTrpArgAlaLeuGluThrPheTrpAlaLysHisMet 

730 740 

2230 2250 2270 

TGGAATTTCATCAGCGGGATACAGTACTTAGCAGGCTTATCCACTGTGCCTGGGAACCCC 

+ + + + + + 

TrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThrLeuProGlyAsnPro

750 760 

2290 2310 2330 

GCAATAGCATCATTK1ATGGCATTCACAGCCTC 

r +- — -. + + + . — + 

AlaIleAlaSerLeuMetAlaPheThrAlaSerIleThrSerProLeuThrThrGlnSer

770 780 

2350 2370 2390 

ACCCTCCTCTTTAACATCTTGG 

i.-^ + — + + - . — + 

ThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeuAlaProProSerAla

790 800 

2410 2430 2450 

GCTTCGGCTTTCGTGGGCGCCGGCATCGCCGGTGC 

+ .+ + +■ + -+ 

AlaSerAlaPheValGlyAlaGlyIleAlaGlyAlaAlaValGlySerIleGlyLeuGly

810 820 

2470 2490 2510

AAGGTKXriTGTGGACATTCTG^^ 

; + + + + + 

LysValLeuValAspIleLeuAlaGlyTyrGlyAlaGlyValAlaGlyAlaLeuValAla 

830 840 
2530 2550 2570

TTCAAGGTCATGAGCGGCGAGATGC 

+ + + + * — + . 

PheLysValMetSerGlyGluMetProSerThrGluAspLeuValAsnLeuLeuProAla

850 860 

2590 2610 2630 

ATCCTCTCTCCTGGCGCCCTGGTCGTC 

+ + + + + 

IleLeuSerProGlyAlaLeuValValGlyValValCysAlaAlaIleLeuArgArgHis

870 880 

2650 2670 2690 

GTGGGTCCGGGAGAGGGGGCTGTGCAGTGGATGAACCGGCTGATAGCGTTCGCCTCGCGG 

. + + + + + 

ValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIleAlaPheAlaSerArg 

890 900 

2710 2730 2750

GGTAATCATGTTTCCCCCACGCACTATGTGCCTGAGAGCGACGCCGCAGCGCGTGTTACT 

+ + + + + 

GlyAsnHisValSerProThrHisTyrValProGluSerAspAlaAlaAlaArgValThr 

910 920 

2770 2790 2810 

CAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGGCTCCACCAGTGGATTAAT 

+ + + + + 

GlnIleLeuSerSerLeuThrIleThrGlnLeuLeuLysArgLeuHisGlnTrpIleAsn

930 940 

2830 2850 2870 

GAAGACTGCTCCACACCGTGTTCCGGCTCGTGGCTAA 

-+ + + + + 

GluAspCysSerThrProCysSerGlyS

950 960 

2890 2910 2930 

ACGGTGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCAGCTACC 

__ __ _, + + + + + 

ThrValLeuThrAspPheLysThrTrpLeuGlnSerLysLeuLeuProGlnLeuProGly

970 980 
2950 2970 2990

GTCCCTTTTTTCTCGTGCGAACGCGGGTACAAGGGAGTCTGGCGGGGAGACGGCATCATG

+ + + '•• + 

ValProPhePheSerCysGlnArgGlyTyrLysGlyValTrpArgGlyAspGlyIleMet

990 1000 

3010 3030 3050 

CAAACCACCTGCCCATGTG<^G£ACAGAT^ 

+ + - — + + + + 

GlnThrThrCysProCysGlyAlaGlnIleThrGlyHisValLysAsnGlySerMetArg

1010 1020 

3070 3090 3110 
ATCGTCGGGCCTAAGACCTCCAGCAACACGTGG : 
: — : — - + + - r -+ + 

IleValGlyProLysThrCysSerAsnThrTrpHisGlyThrPheProIleAsnAlaTyr 

1030 1040 

3130 3150 3170 

ACCACGGGCCCCTGCACACCCTC 

— + + + --: + + 

ThrThrGlyProCysThrProSerProAlaProAsnTyrSerArgAlaLeuTrpArgVal 

1050 1060 

3190 3210 3230 

GCCC^TGAGGAGTACGTGGAGG 

+ +- — + + + — — -< — 

AlaAlaGluGluTyrValGluValThrArgValGlyAspPheHisTyrValThrGlyMet 

1070 1080 

3250 3270 3290 

ACCACTGACAACGTAAAGTGCCCATGCCAGGTTCCGGCTC 

+ 1 + — . -+ + + ■--+ 

ThrThrAspAsnValLysCysProCysGlnValProAlaProGluPhePheThrGluVal

1090 1100 

3310 3330 3350
GACGGAGTCCGGTTGCACAGGTACGCTCCGGCGTGCAGGCCTCTCCTACGGGAGGAGGTT
* a — — + + + +_ s + ; + 

AspGlyValArgLeuHisArgTyrAlaProAlaCysArgProLeuLeuArgGluGluVal

1110 1120 
3370 3390 3410

ACATTCCAGGTCGGGCTCAACCAATACCTGGTTGGGTCACAGCTACCATGCGA 

+ + + + + + 

ThrPheGlnValGlyLeuAsnGlnTyrLeuValGlySerGlnLeuProCysGluProGlu 

1130 1140 

3430 3450 3470 

CCGGATGTAGCAGTGCTCACTTCCATGCTCACCGACCCCTCCCACATCACAGCAGAAACG 

+ + + + + + 

ProAspValAlaValLeuThrSerMetLeuThrAspProSerHisIleThrAlaGluThr 

1150 1160 

3490 3510 3530 

GCTAAGCGTAGGTTGGCCAGGGGGTCTCCCCCCTCCTTGGCCAGCTCTTCAGCTAGCCAG 

+ + + + + + 

AlaLysArgArgLeuAlaArgGlySerProProSerLeuAlaSerSerSerAlaSerGln

1170 1180 

3550 3570 3590 

TTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCACCATGTCTCTCCGGACGCTGAC 

+ + + + + + 

LeuSerAlaProSerLeuLysAlaThrCysThrThrHisHisValSerProAspAlaAsp 

1190 1200 

3610 3630 3650 

CTCATCGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGGAACATCACCCGCGTGGAG 

+ + + + + + 

LeuIleGluAlaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsnIleThrArgValGlu

1210 1220 

3670 3690 3710 

TCGGAGAACAAGGTGGTAGTCCTGGACTCTTTCGACCCGC 

+ „._ - + + + + 

SerGluAsnLysValValValLeuAspSerPheAspProLeuArgAlaGluGluAspGlu

1230 1240 

3730 3750 3770 

AGGGAAGTATCCGTTCCGGCGGAGATCCTGCGGAAATCCAAGAAGTTCCCCGCAGCGATG 

4. + + + + + 

ArgGluValSerValProAlaGluIleLeuArgLysSerLysLysPheProAlaAlaMet

1250 1260 
3790 3810 3830

CCCATCTGGGCGCGCCCGGATTACAACCCTCCACTGTTAGAGTCCTGGAAGGACCCGGAC 

__- + + + + + • + 

ProIleTrpAlaArgProAspTyrAsnProProLeuLeuGluSerTrpLysAspProAsp 

1270 1280 

3850 3870 3890 

TACGTCCCTCCGGTGGTGCACGGGTGCCCGTTGCCACCTATCAAGGCCCCTCCAATACCA 

4- + + - + + = + 

TyrValProProValValHisGlyCysProLeuProProIleLysAlaProProIlePro

1290 1300 

3910 3930 3950 

CCTCCACGGAGAAAGAGGACGGTTGTCCTAACAG 

— +. + — + + — ■ + -+ 

ProProArgArgLysArgThrValValLeuThrGluSerSerValSerSerAlaLeuAla

1310 1320 

3970 3990 4010 
GAGCTCGCTACTAAGACCTTCGGCAGCTCCGAATCATCGGCCGTCGA
^ — + — , — + — !- u — + + + 

GluLeuAlaThrLysThrPheGlySerSerGluSerSerAlaValAspSerGlyThrAla

1330 1340 

4030 4050 4070 

ACCGCGCTTCCTGACCAG<^CTCCGACGACGGTGACAAAGGATC 

+ + +: — + 

ThrAlaLeuProAspGlnAlaSerAspAspGlyAspLysGlySerAspValGluSerTyr 

1350 1360 

4090 4110 4130

TCCTCCATGCCCCCCCTTGAGGGGGAACCGGGGGACC^ 

----^ — -__^--^+_ . + „ + - + + 

SerSerMetProProLeuGluGlyGluProGlyAspProAspLeuSerAspGlySerTrp

1370 1380 

4150 4170 4190 

TCTACCGTGAC^GAGGAAGCTAGTGAGGA 

— — +--- — + + ♦ 

SerThrValSerGluGluAlaSerGluAspValValCysCysSerMetSerTyrThr^

1390 1400
4210 4230 4250

ACAGGCGCCTTGATCACGCCATGCGCTGCGGA^ 

^ + + + + + 

ThrGlyAlaLeuIleThrProCysAlaAlaGluGluSerLysLeuProIleAsnAlaLeu

1410 1420 

4270 4290 4310 

AGCAACTCTTTGCTGCGCCACCATAACATGGTTTATGCCACAACATC 

+ + + + + 

SerAsnSerLeuLeuArgHisHisAsnMetValTyrAlaThrThrSerArgSerAlaGly

1430 1440 

4330 4350 4370 

CTGCGGCAGAAGAAGGTCACCTTTGACAGACTGCAAGTCCTGGACGACCACTACCGGGAC

_ + + + + 4- 

LeuArgGlnLysLysValThrPheAspArgLeuGlnValLeuAspAspHisTyrArgAsp

1450 1460 

4390 4410 4430 

GTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAAACTCCTATCCGTAGAG 

+ + + + + + 

ValLeuLysGluMetLysAlaLysAlaSerThrValLysAlaLysLeuLeuSerValGlu

1470 1480 

4450 4470 4490 

GAAGCCTGCAAGCTGACGCCCCCACATTCGGCCAAATCCAAGTTTGGCTATGGGGCAAAG 

m + - + — + 

GluAlaCysLysLeuThrProProHisSerAlaLysSerLysPheGlyTyrGlyAlaLys 

1490 1500

4510 4530 4550 

GACGTCCGGAACCTATCCAGCAAGGCCGTTAACCACATCCACTCCGTGTGGAAGGACTTG 

+ + + + + 

AspValArgAsnLeuSerSerLysAlaValAsnHisIleHisSerValTrpLysAspLeu 

1510 1520 

4570 4590 4610 

CTGGAAGACACTGTGACACCAATTGACACCACCATCATGG 



- + 



LeuGluAspThrValThrProIleAspThrThrIleMetAlaLysAsnGluValPheCys

1530 1540 
4630 4650 4670

GTCCAACCAGAGAAAGGAGGCCGTAAGCCAGCCCGCCTTATCGTATTCCCAGATCTGGGA 

+ + + + + . 

ValGlnProGluLysGlyGlyArgLysProAlaArgLeuIleValPheProAspLeuGly

1550 1560 

4690 4710 4730 

GTCCGTGTATGCGAGAAGATGGCCCTCTATGATGTGGTC 

. + + ; \ \- + + + 

ValArgValCysGluLysMetAlaLeuTyrAspValValSerThrLeuProGlnValVal 

1570 1580 

4750 4770 4790 

ATGGGCTCCTCATACGGATTCCAGTACTCTCCTGGGCAGCGAGTCGAGTTCCTGGTGAAT 

_ + + + . + + . + 

MetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArgValGluPheLeuValAsn 

1590 1600

4810 4830 4850

ACCTGGAAATCAAAGAAAAACCCCATGGGCTTTTCAT^ 

+ + + — : + — + 

ThrTrpLysSerLysLysAsnProMetGlyPheSerTyrAspThrArgCysPheAspSer

1610 1620

4870 4890 4910 

ACGGTCACCGAGAACGACATCCGTGTTGAGGAGTCAATTTA 

+ + _ + -+ + + 

ThrValThrGluAsnAspIleArgValGluGluSerIleTyrGlnCysCysAspLeuAla

1630 1640

4930 4950 4970 

CCCGAAGCCAGACAGGCCATAAAATCGCTCACAGAGCGGCTTTATATCGGGGGTCCTCTG 

+ + + _ + + 

ProGluAlaArgGlnAlaIleLysSerLeuThrGluArgLeuTyrIleGlyGlyProLeu

1650 1660 

4990 5010 5030 

ACTAATTCAAAAGGGCAGAACTGCGGTTATCGCCGGTGCC 

+ . + + — + + . 

ThrAsnSerLysGlyGlnAsnCysGlyTyrArgArgCysArgAlaSerGlyValLeuThr

1670 1680 
5050 5070 5090

ACTAGCTGCGGTAACACCCTCACATGTTACTTGAAGGCCTCTGCAGCCTGTCGAGCTGCG 

- - + — * + + 

ThrSerCysGlyAsnThrLeuThrCysTyrLeuLysAlaSerAlaAlaCysArgAlaAla 

1690 1700



5110 5130 5150 

AAGCTCCAGGACTGCACGATGCTCGTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGC 

+ + ♦ ♦ + + 

LysLeuGlnAspCysThrMetLeuValAsnGlyAspAspLeuValValIleCysGluSer

1710 1720 



5170 5190 5210 

GCGGGAACCCAAGAGGACGCGGCGAGCCTACGAGTCTTCACGGAGGCTATGACTAGGTAC 

+ + + + + + 

AlaGlyThrGlnGluAspAlaAlaSerLeuArgValPheThrGluAlaMetThrArgTyr 

1730 1740 

5230 5250 5270 

TCTGCCCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGAGCTGATAACATCATGT 

_ A ___ H + + + 

SerAlaProProGlyAspProProGlnProGluTyrAspLeuGluLeuIleThrSerCys 

1750 1760 

5290 5310 5330 

TCCTCCAATGTGTCGGTCGCCCACGATGCATCAGGCAAAAGGGTGTACTACCTCACCCGT 

+ + -+ + + 

SerSerAsnValSerValAlaHisAspAlaSerGlyLysArgValTyrTyrLeuThrArg 

1770 1780

5350 5370 5390 

GATCCCACCACCCCCCTCGCACGGGCTGCGTGGGAAACAGCTAGACACACTCCAGTTAAC 

+ + + + 

AspProThrThrProLeuAlaArgAlaAlaTrpGluThrAlaArgHisThrProValAsn 

1790 1800 

5410 5430 5450 

TCCTGGCTAGGCAACATTATCATGTATGCGCCCACTTTGTGGGCAAGGATGATTCTGATG 

+ +- + 

SerTrpLeuGlyAsnIleIleMetTyrAlaProThrLeuTrpAlaArgMetIleLeuMet

1810 1820 
5470 5490 5510

ACTCACTTCTTCTCCATCCTTCTAGCACAGGAGCAAC™ 

+ + + - - + - + I ~ + 

ThrHisPhePheSerIleLeuLeuAlaGlnGluGlnLeuGluLysAlaLeuAspCysGln

1830 1840 

5530 5550 5570 

ATCTACGGGGCCTGTTACTCCATTGAGCCACTTGACCTACCTCAGATCATTGAAC

+ + + . + + — — . + 

IleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuProGlnIleIleGluArgLeu

1850 1860

5590 5610 5630 

CATGGCCTTAGCGCATTTTCACTCCATAGTTAC 

; + + + + --- + + 

HisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGlyGluIleAsnArgValAla

1870 1880 

5650 5670 5690 

TCATGCCTCAGGAAACTTGGGGTACCACCCTT^ 

+ h . + . + — - — — + 

SerCysLeuArgLysLeuGlyValProProLeuArgValTrpArgHisArgAlaArgSer

1890 1900 

5710 5730 5750 

GTCCGCGCTAGGCTACTGTCCCAGGGGGGGAGGGCCGCCA^ 

L + . + ; + + - — + + 

ValArgAlaArgLeuLeuSerGlnGlyGlyArgAlaAlaThrCysGlyLysTyrLeuPhe

1910 1920 

5770 5790 5810

AACTGGGCAGTGAAGACCAAACTCAAACTCACTCCAATCCCGGCTGCGTCCCAGCTGGAC

_w _ + , + _ . + - — + ^- + ,J- + 

AsnTrpAlaValLysThrLysLeuLysLeuThrProIleProAlaAlaSerGlnLeuAsp

1930 1940 

5830 5850 5870 

TTGTCCGGCTGGTTCGTTGCTCGTTACAGCGGGGGAGACATATATC 

+ _ + y + ■ + — + 

LeuSerGlyTrpPheValAlaGlyTyrSerGlyGlyAspIleTyrHisSerLeuSerArg 

1950 1960
5890 5910 5930

GCCCGACCCCGCTGGTTCATGCTGTGCCTACTCCTACTTTCTGTAGGGGTAGGCATCTAC 



AlaArgProArgTrpPheMetLeuCysLeuLeuLeuLeuSerValGlyValGlyIleTyr

1970 1980 

5950 5955 
CTGCTCCCCAACCGA (SEQ. ID. NO. 5) 



LeuLeuProAsnArg (SEQ. ID. NO. 6) 
1985 
1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG

51 GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG 

101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG 

151 CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGCG GTGTGAAATA 

201 CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG CTATTGGCCA 

251 TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 

301 TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT 

351 AATCAATTAC GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT 

401 ACATAACTTA CGGTAAATGG CCCGCCTGGC TGACCGCCCA ACGACCCCCG 

451 CCCATTGACG TCAATAATGA CGTATGTTCC CATAGTAACG CCAATAGGGA 

501 CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC TGCCCACTTG 

551 GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 

601 TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG 

651 ACTTTCCTAC TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG

701 GTGATGCGGT TTTGGCAGTA CATCAATGGG CGTGGATAGC GGTTTGACTC 

751 ACGGGGATTT CCAAGTCTCC ACCCCATTGA CGTCAATGGG AGTTTGTTTT 

801 GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA CTCCGCCCCA 

851 TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG

901 AGCTCGTTTA GTGAACCCTC AGATCGCCTG GAGACGCCAT CGACGCTGTT 

951 TTGACCTCCA TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA 

1001 CGGTGCATTG GAACGCGGAT TCCCCGTGCC AAGAGTGACG TAAGTACCGC 

1051 CTATAGACTC TATAGGCACA CCCCTTTGGC TCTTATGCAT GCTATACTGT 

1101 TTTTGGCTTG GGGCCTATAC ACCCCCGCTT CCTTATGCTA TAGGTGATGG 

1151 TATAGCTTAG CCTATAGGTG TGGGTTATTG ACCATTATTG ACCACTCCCC 

1201 TATTGGTGAC GATACTTTCC ATTACTAATC CATAACATGG CTCTTTGCCA 

1251 CAACTATCTC TATTGGCTAT ATGCCAATAC TCTGTCCTTC AGAGACTGAC 

1301 ACGGACTCTG TATTTTTACA GGATGGGGTC CCATTTATTA TTTACAAATT 

1351 CACATATACA ACAACGCCGT CCCCCGTGCC CGCAGTTTTT ATTAAACATA 

1401 GCGTGGGATC TCCACGCGAA TCTCGGGTAC GTGTTCCGGA CATGGGCTCT 

1451 TCTCCGGTAG CGGCGGAGCT TCCACATCCG AGCCCTGGTC CCATGCCTCC 

1501 AGCGGCTCAT GGTCGCTCGG CAGCTCCTTG CTCCTAACAG TGGAGGCCAG 

1551 ACTTAGGCAC AGCACAATGC CCACCACCAC CAGTGTGCCG CACAAGGCCG

1601 TGGCGGTAGG GTATGTGTCT GAAAATGAGC GTGGAGATTG GGCTCGCACG 

1651 GCTGACGCAG ATGGAAGACT TAAGGCAGCG GCAGAAGAAG ATGCAGGCAG 

1701 CTGAGTTGTT GTATTCTGAT AAGAGTCAGA GGTAACTCCC GTTGCGGTGC 

1751 TGTTAACGGT GGAGGGCAGT GTAGTCTGAG CAGTACTCGT TGCTGCCGCG 

1801 CGCGCCACCA GACATAATAG CTGACAGACT AACAGACTGT TCCTTTCCAT 

1851 GGGTCTTTTC TGCAGTCACC GTCCTTAGAT CTAGGTACCA GATATCAGAA 

1901 TTCAGTCGAC AGCGGCCGCG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC 

1951 TGTTGTTTGC CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC 

2001 CCACTGTCCT TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT 

2051 AGGTGTCATT CTATTCTGGG GGGTGGGGTG GGGCAGGACA GCAAGGGGGA 
2101 GGATTGGGAA GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG

2151 CCGCTGCGGC CAGGTGCTGA AGAATTGACC CGGTTCCTCC TGGGCCAGAA

2201 AGAAGCAGGC ACATCCCCTT CTCTGTGACA CACCCTGTCC ACGCCCCTGG

2251 TTCTTAGTTC CAGCCCCACT CATAGGACAC TCATAGCTCA GGAGGGCTCC

2301 GCCTTCAATC CCACCCGCTA AAGTACTTGG AGCGGTCTCT CCCTCCCTCA

2351 TCAGCCCACC AAACCAAACC TAGCCTCCAA GAGTGGGAAG AAATTAAAGC

2401 AAGATAGGCT ATTAAGTGCA GAGGGAGAGA AAATGCCTCC AACATGTGAG

2451 GAAGTAATGA GAGAAATCAT AGAATTTCTT CCGCTTCCTC GCTCACTGAC

2501 TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA GCGGTATCAG CTCACTCAAA

2551 GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA GGAAAGAACA

2601 TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG

2651 CTGGCGTTTT TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG

2701 ACGCTCAAGT CAGAGGTGGC GAAACCCGAC AGGACTATAA AGATACCAGG

2751 CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC GACCCTGCCG

2801 CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC

2851 TCATAGCTCA CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA

2901 AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA

2951 TCCGGTAACT ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC

3001 ACTGGCAGCA GCCACTGGTA ACAGGATTAG CAGAGCGAGG TATGTAGGCG

3051 GTGCTACAGA GTTCTTGAAG TGGTGGCCTA ACTACGGCTA CACTAGAAGA

3101 ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT TCGGAAAAAG

3151 AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT

3201 TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA

3251 GATCCTTTGA TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC

3301 ACGTTAAGGG ATTTTGGTCA TGAGATTATC AAAAAGGATC TTCACCTAGA

3351 TCCTTTTAAA TTAAAAATGA AGTTTTAAAT CAATCTAAAG TATATATGAG

3401 TAAACTTGGT CTGACAGTTA CCAATGCTTA ATCAGTGAGG CACCTATCTC

3451 AGCGATCTGT CTATTTCGTT CATCCATAGT TGCCTGACTC GGGGGGGGGG

3501 GGCGCTGAGG TCTGCCTCGT GAAGAAGGTG TTGCTGACTC ATACCAGGCC

3551 TGAATCGCCC CATCATCCAG CCAGAAAGTG AGGGAGCCAC GGTTGATGAG

3601 AGCTTTGTTG TAGGTGGACC AGTTGGTGAT TTTGAACTTT TGCTTTGCCA

3651 CGGAACGGTC TGCGTTGTCG GGAAGATGCG TGATCTGATC CTTCAACTCA

3701 GCAAAAGTTC GATTTATTCA ACAAAGCCGC CGTCCCGTCA AGTCAGCGTA

3751 ATGCTCTGCC AGTGTTACAA CCAATTAACC AATTCTGATT AGAAAAACTC

3801 ATCGAGCATC AAATGAAACT GCAATTTATT CATATCAGGA TTATCAATAC

3851 CATATTTTTG AAAAAGCCGT TTCTGTAATG AAGGAGAAAA CTCACCGAGG

3901 CAGTTCCATA GGATGGCAAG ATCCTGGTAT CGGTCTGCGA TTCCGACTCG

3951 TCCAACATCA ATACAACCTA TTAATTTCCC CTCGTCAAAA ATAAGGTTAT

4001 CAAGTGAGAA ATCACCATGA GTGACGACTG AATCCGGTGA GAATGGCAAA

4051 AGCTTATGCA TTTCTTTCCA GACTTGTTCA ACAGGCCAGC CATTACGCTC

4101 GTCATCAAAA TCACTCGCAT CAACCAAACC GTTATTCATT CGTGATTGCG

4151 CCTGAGCGAG ACGAAATACG CGATCGCTGT TAAAAGGACA ATTACAAACA

4201 GGAATCGAAT GCAACCGGCG CAGGAACACT GCCAGCGCAT CAACAATATT

4251 TTCACCTGAA TCAGGATATT CTTCTAATAC CTGGAATGCT GTTTTCCCGG 

4301 GGATCGCAGT GGTGAGTAAC CATGCATCAT CAGGAGTACG GATAAAATGC 

4351 TTGATGGTCG GAAGAGGCAT AAATTCCGTC AGCCAGTTTA GTCTGACCAT 

4401 CTCATCTGTA ACATCATTGG CAACGCTACC TTTGCCATGT TTCAGAAACA

4451 ACTCTGGCGC ATCGGGCTTC CCATACAATC GATAGATTGT CGCACCTGAT 

4501 TGCCCGACAT TATCGCGAGC CCATTTATAC CCATATAAAT CAGCATCCAT 

4551 GTTGGAATTT AATCGCGGCC TCGAGCAAGA CGTTTCCCGT TGAATATGGC

4601 TCATAACACC CCTTGTATTA CTGTTTATGT AAGCAGACAG TTTTATTGTT

4651 CATGATGATA TATTTTTATC TTGTGCAATG TAACATCAGA GATTTTGAGA 

4701 CACAACGTGG CTTTCCCCCC CGCCCCATTA TTGAAGCATT TATCAGGGTT 

4751 ATTGTCTCAT GAGCGGATAG ATATTTGAAT GTATTTAGAA AAATAAACAA

4801 ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCTAAGA 

4851 AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC 

4901 CCTTTCGTC 
1 CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT 
121 GATGTTGTAA GTGTGGCGGA ACACATGTAA GCGCCGGATG TGGTAAAAGT GACGTTTTTG 
181 GTGTGCGCCG GTGTACACGG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG 
241 TAAATTTGGG CGTAACCAAG TAATATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA 
301 AGTGAAATCT GAATAATTCT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG
361 GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC 
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG CGCAGTGTAT TTATACCCGG 
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 
601 AATGGCCGCC AGTCTTTTGG ACCAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC 
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GAGTCTGTAA TGTTGGCGGT 
781 GCAGGAAGGG ATTGACTTAT TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA 
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA 
901 CCTTGTGCCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA 
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG 
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAAATTATGG GCAGTGGGTG 
1141 ATAGAGTGGT GGGTTTGGTG TGGTAATTTT TTTTTTAATT TTTACAGTTT TGTGGTTTAA 
1201 AGAATTTTGT ATTGTGATTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGG CGTCCTAAAT TGGTGCCTGC TATCCTGAGA 
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT 
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG 
1501 TCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG 
1801 TTTCTGTGGG GCTCCTCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT GGTGAGACAC 
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCAA TAATACCGAC GGAGGAGCAA 
2161 CAGCAGGAGG AAGCCAGGCG GCGGCGGCGG CAGGAGCAGA GCCCATGGAA CCCGAGAGCC 
2221 GGCCTGGACC CTCGGGAATG AATGTTGTAC AGGTGGCTGA ACTGTTTCCA GAACTGAGAC 
2281 GCATTTTAAC CATTAACGAG GATGGGCAGG GGCTAAAGGG GGTAAAGAAG GAGCGGGGGG 
2341 CTTCTGAGGC TACAGAGGAG GCTAGGAATC TAACTTTTAG CTTAATGACC AGACACCGTC 
2401 CTGAGTGTGT TACTTTTCAG CAGATTAAGG ATAATTGCGC TAATGAGCTT GATCTGCTGG 
2461 CGCAGAAGTA TTCCATAGAG CAGCTGACCA CTTACTGGCT GCAGCCAGGG GATGATTTTG 
2521 AGGAGGCTAT TAGGGTATAT GCAAAGGTGG CACTTAGGCC AGATTGCAAG TACAAGATTA
2581 GCAAACTTGT AAATATCAGG AATTGTTGCT ACATTTCTGG GAACGGGGCC GAGGTGGAGA 
2641 TAGATACGGA GGATAGGGTG GCCTTTAGAT GTAGCATGAT AAATATGTGG CCGGGGGTGC
2701 TTGGCATGGA CGGGGTGGTT ATTATGAATG TGAGGTTTAC TGGTCCCAAT TTTAGCGGTA
2761 CGGTTTTCCT GGCCAATACC AATCTTATCC TACACGGTGT AAGCTTCTAT GGGTTTAACA
2821 ATACCTGTGT GGAAGCCTGG ACCGATGTAA GGGTTCGGGG CTGTGCCTTT TACTGCTGCT 
2881 GGAAGGGGGT GGTGTGTCGC CCCAAAAGCA GGGCTTCAAT TAAGAAATGC CTGTTTGAAA 
2941 GGTGTACCTT GGGTATCCTG TCTGAGGGTA ACTCCAGGGT GCGCCACAAT GTGGCCTCCG 
3001 ACTGTGGTTG CTTTATGCTA GTGAAAAGCG TGGCTGTGAT TAAGCATAAC ATGGTGTGTG
3061 GCAACTGCGA GGACAGGGCC TCTCAGATGC TGACCTGCTC GGACGGCAAG TGTCACTTGC
3121 TGAAGACCAT TCACGTAGCC AGCCACTCTC GCAAGGCCTG GCCAGTGTTT GAGCACAACA 
3181 TACTGACCCG CTGTTCCTTG CATTTGGGTA ACAGGAGGGG GGTGTTCCTA CCTTACCAAT 
3241 GCAATTTGAG TCACACTAAG ATATTGCTTG AGCCCGAGAG CATGTCCAAG GTGAACCTGA 
3301 ACGGGGTGTT TGACATGACC ATGAAGATCT GGAAGGTGCT GAGGTACGAT GAGACCCGCA 
3361 CCAGGTGCAG ACCCTGCGAG TGTGGCGGTA AACATATTAG GAACCAGCCT GTGATGCTGG 
3421 ATGTGACCGA GGAGCTGAGG CCCGATCACT TGGTGCTGGC CTGCACCCGC GCTGAGTTTG
3481 GCTCTAGCGA TGAAGATACA GATTGAGGTA CTGAAATGTG TGGGCGTGGC TTAAGGGTGG 
3541 GAAAGAATAT ATAAGGTGGG GGTCTCATGT AGTTTTGTAT CTGTTTTGCA GCAGCCGCCG 
3601 CCATGAGCGC CAACTCGTTT GATGGAAGCA TTGTGAGCTC ATATTTGACA ACGCGCATGC
3661 CCCCATGGGC CGGGGTGCGT CAGAATGTGA TGGGCTCCAG CATTGATGGT CGCCCCGTCC 
3721 TGCCCGCAAA CTCTACTACC TTGACCTACG AGACCGTGTC TGGAACGCCG TTGGAGACTG 
3781 CAGCCTCCGC CGCCGCTTCA GCCGCTGCAG CCACCGCCCG CGGGATTGTG ACTGACTTTG 
3841 CTTTCCTGAG CCCGCTTGCA AGCAGTGCAG CTTCCCGTTC ATCCGCCCGC GATGACAAGT 
3901 TGACGGCTCT TTTGGCACAA TTGGATTCTT TdACCCGGGA ACTTAATGTC GTTTCTCAGG 
3961 AGCTGTTGGA TCTGCGCCAG CAGGTTTCTG CCCTGAAGGC TTCCTCCCCT CCCAATGCGG 
4021 TTTAAAACAT AAATAAAAAC CAGACTCTGT TTGGATTTGG ATCAAGCAAG TGTCTTGCTG
4081 TCTTTATTTA GGGGTTTTGC GCGCGCGGTA GGCCCGGGAC CAGCGGTCTC GGTCGTTGAG
4141 GGTCCTGTGT ATTTTTTCCA GGACGTGGTA AAGGTGACTC TGGATGTTCA GATACATGGG 
4201 CATAAGCCCG TCTCTGGGGT GGAGGTAGCA CCACTGCAGA GCTTCATGCT GCGGGGTGGT 
4261 GTTGTAGATG ATCCAGTCGT AGCAGGAGCG CTGGGCGTGG TGCCTAAAAA TGTCTTTCAG 
4321 TAGCAAGCTG ATTGCCAGGG GCAGGCCCTT GGTGTAAGTG TTTACAAAGC GGTTAAGCTG 
4381 GGATGGGTGC ATACGTGGGG ATATGAGATG CATCTTGGAC TGTATTTTTA GGTTGGCTAT 
4441 GTTCCCAGCC ATATCCCTCC GGGGATTCAT GTTGTGCAGA ACCACCAGCA CAGTGTATCC 
4501 GGTGCACTTG GGAAATTTGT CATGTAGCTT AGAAGGAAAT GCGTGGAAGA ACTTGGAGAC
4561 GCCCTTGTGA CCTCCAAGAT TTTCCATGCA TTCGTCCATA ATGATGGCAA TGGGCCCACG 
4621 GGCGGCGGCC TGGGCGAAGA TATTTCTGGG ATCACTAACG TCATAGTTGT GTTCCAGGAT 
4681 GAGATCGTCA TAGGCCATTT TTACAAAGCG CGGGCGGAGG GTGCCAGACT GCGGTATAAT
4741 GGTTCCATCC GGCCCAGGGG CGTAGTTACC CTCACAGATT TGCATTTCCC ACGCTTTGAG 
4801 TTCAGATGGG GGGATCATGT CTACCTGCGG GGCGATGAAG AAAACCGTTT CCGGGGTAGG 
4861 GGAGATCAGC TGGGAAGAAA GCAGGTTCCT AAGCAGCTGC GACTTACCGC AGCCGGTGGG 
4921 CCCGTAAATC ACACCTATTA CCGGCTGCAA CTGGTAGTTA AGAGAGCTGC AGCTGCCGTC
4981 ATCCCTGAGC AGGGGGGCCA CTTCGTTAAG CATGTCCCTG ACTTGCATGT TTTCCCTGAC
5041 CAAATCCGCC AGAAGGCGCT CGCCGCCCAG CGATAGCAGT TCTTGCAAGG AAGCAAAGTT
5101 TTTCAACGGT TTGAGGCCGT CCGCCGTAGG CATGCTTTTG AGCGTTTGAC CAAGCAGTTC 
5161 CAGGCGGTCC CACAGCTCGG TCACGTGCTC TACGGCATCT CGATCCAGCA TATCTCCTCG 
5221 TTTCGCGGGT TGGGGCGGCT TTCGCTGTAC GGCAGTAGTC GGTGCTCGTC CAGACGGGCC 
5281 AGGGTCATGT CTTTCCACGG GCGCAGGGTC CTCGTCAGCG TAGTCTGGGT CACGGTGAAG 
5341 GGGTGCGCTC CGGGTTGCGC GCTGGCCAGG GTGCGCTTGA GGCTGGTCCT GCTGGTGCTG 
5401 AAGCGCTGCC GGTCTTCGCC CTGCGCGTCG GCCAGGTAGC ATTTGACCAT GGTGTCATAG 
5461 TCCAGCCCCT CCGCGGCGTG GCCCTTGGCG CGCAGCTTGC CCTTGGAGGA GGCGCCGCAC 
5521 GAGGGGCAGT GCAGACTTTT AAGGGCGTAG AGCTTGGGCG CGAGAAATAC CGATTCCGGG 
5581 GAGTAGGCAT CCGCGCCGCA GGCCCCGCAG ACGGTCTCGC ATTCCACGAG CCAGGTGAGC 
5641 TCTGGCCGTT CGGGGTCAAA AACCAGGTTT CCCCCATGCT TTTTGATGCG TTTCTTACCT 
5701 CTGGTTTCCA TGAGCCGGTG TCCACGCTCG GTGACGAAAA GGCTGTCCGT GTCCCCGTAT 
5761 ACAGACTTGA GAGGCCTGTC CTCGAGCGGT GTTCCGCGGT CCTCCTCGTA TAGAAACTCG 
5821 GACCACTCTG AGACGAAGGC TCGCGTCCAG GCCAGCACGA AGGAGGCTAA GTGGGAGGGG 
5881 TAGCGGTCGT TGTCCACTAG GGGGTCCACT CGCTCCAGGG TGTGAAGACA CATGTCGCCC 
5941 TCTTCGGCAT CAAGGAAGGT GATTGGTTTA TAGGTGTAGG CCACGTGACC GGGTGTTCCT 
6001 GAAGGGGGGC TATAAAAGGG GGTGGGGGCG CGTTCGTCCT CACTCTCTTC CGCATCGCTG 
6061 TCTGCGAGGG CCAGCTGTTG GGGTGAGTAC TCCCTCTCAA AAGCGGGCAT GACTTCTGCG 
6121 CTAAGATTGT CAGTTTCCAA AAACGAGGAG GATTTGATAT TCACCTGGCC CGCGGTGATG 
6181 CCTTTGAGGG TGGCCGCGTC CATCTGGTCA GAAAAGACAA TCTTTTTGTT GTCAAGCTTG 
6241 GTGGCAAACG ACCCGTAGAG GGCGTTGGAC AGCAACTTGG CGATGGAGCG CAGGGTTTGG 
6301 TTTTTGTCGC GATCGGCGCG CTCCTTGGCC GCGATGTTTA GCTGCACGTA TTCGCGCGCA 
6361 ACGCACCGCC ATTCGGGAAA GACGGTGGTG CGCTCGTCGG GCACTAGGTG CACGCGCCAA 
6421 CCGCGGTTGT GCAGGGTGAC AAGGTCAACG CTGGTGGCTA CCTCTCCGCG TAGGCGCTCG 
6481 TTGGTCCAGC AGAGGCGGCC GCCCTTGCGC GAGCAGAATG GCGGTAGTGG GTCTAGCTGC 
6541 GTCTCGTCCG GGGGGTCTGC GTCCACGGTA AAGACCCCGG GCAGCAGGCG CGCGTCGAAG 
6601 TAGTCTATCT TGCATCCTTG CAAGTCTAGC GCCTGCTGCC ATGCGCGGGC GGCAAGCGCG 
6661 CGCTCGTATG GGTTGAGTGG GGGACCCCAT GGCATGGGGT GGGTGAGCGC GGAGGCGTAC 
6721 ATGCCGCAAA TGTCGTAAAC GTAGAGGGGC TCTCTGAGTA TTCCAAGATA TGTAGGGTAG 
6781 CATCTTCCAC CGCGGATGCT GGCGCGCACG TAATCGTATA GTTCGTGCGA GGGAGCGAGG 
6841 AGGTCGGGAC CGAGGTTGCT ACGGGCGGGC TGCTCTGCTC GGAAGACTAT CTGCCTGAAG 
6901 ATGGCATGTG AGTTGGATGA TATGGTTGGA CGCTGGAAGA CGTTGAAGCT GGCGTCTGTG 
6961 AGACCTACCG CGTCACGCAC GAAGGAGGCG TAGGAGTCGC GCAGCTTGTT GACCAGCTCG 
7021 GCGGTGACCT GCACGTCTAG GGCGCAGTAG TCCAGGGTTT CCTTGATGAT GTCATACTTA 
7081 TCCTGTCCCT TTTTTTTCCA CAGCTCGCGG TTGAGGACAA ACTCTTCGCG GTCTTTCCAG 
7141 TACTCTTGGA TCGGAAACCC GTCGGCCTCC GAACGGTAAG AGCCTAGCAT GTAGAACTGG 
7201 TTGACGGCCT GGTAGGCGCA GCATCCCTTT TCTACGGGTA GCGCGTATGC CTGCGCGGCC 
7261 TTCCGGAGCG AGGTGTGGGT GAGCGCAAAG GTGTCCCTAA CCATGACTTT GAGGTACTGG 
7321 TATTTGAAGT CAGTGTCGTC GCATCCGCCC TGCTCCCAGA GCAAAAAGTC CGTGCGCTTT 
7381 TTGGAACGCG GGTTTGGCAG GGCGAAGGTG ACATCGTTGA AGAGTATCTT TCCCGCGCGA 
7441 GGCATAAAGT TGCGTGTGAT GCGGAAGGGT CCCGGCACCT CGGAACGGTT GTTAATTACC 
7501 TGGGCGGCGA GCACGATCTC GTCAAAGCCG TTGATGTTGT GGCCCACAAT GTAAAGTTCC 
7561 AAGAAGCGCG GGATGCCCTT GATGGAAGGC AATTTTTTAA GTTCCTCGTA GGTGAGCTCT
7621 TCAGGGGAGC TGAGCCCGTG CTCTGAAAGG GCCCAGTCTG CAAGATGAGG GTTGGAAGCG 
7681 ACGAATGAGC TCCACAGGTC ACGGGCCATT AGCATTTGCA GGTGGTCGCG AAAGGTCCTA
7741 AACTGGCGAC CTATGGCCAT TTTTTCTGGG GTGATGCAGT AGAAGGTAAG CGGGTCTTGT
7801 TCCCAGCGGT CCCATCCAAG GTCCGCGGCT AGGTCTCGCG CGGCGGTCAC TAGAGGCTCA 
7861 TCTCCGCCGA ACTTCATGAC GAGCATGAAG GGCACGAGCT GCTTCCCAAA GGCCCCCATC
7921 CAAGTATAGG TCTCTACATC GTAGGTGACA AAGAGACGCT CGGTGCGAGG ATGCGAGCCG
7981 ATCGGGAAGA ACTGGATCTC CCGCCACCAG TTGGAGGAGT GGCTGTTGAT GTGGTGAAAG 
8041 TAGAAGTCCC TGCGACGGGC CGAACACTCG TGCTGGCTTT TGTAAAAACG TGCGCAGTAC 
8101 TGGCAGCGGT GCACGGGCTG TACATCCTGC ACGAGGTTGA CCTGACGACC GCGCACAAGG 
8161 AAGCAGAGTG GGAATTTGAG CCCCTCGCCT GGCGGGTTTG GCTGGTGGTC TTCTACTTCG 
8221 GCTGCTTGTC CTTGACCGTC TGGCTGCTCG AGGGGAGTTA CGGTGGATCG GACCACCACG 
8281 CCGCGCGAGC CCAAAGTCCA GATGTCCGCG CGCGGCGGTC GGAGCTTGAT GACAACATCG 
8341 CGCAGATGGG AGCTGTCCAT GGTCTGGAGC TCCCGCGGCG TCAGGTCAGG CGGGAGCTCC 
8401 TGCAGGTTTA CCTCGCATAG CCGGGTCAGG GCGCGGGCTA GGTCCAGGTG ATACCTGATT
8461 TCCAGGGGCT GGTTGGTGGC GGCGTCGATG GCTTGCAAGA GGCCGCATCC CCGCGGCGCG 
8521 ACTACGGTAC CGCGCGGCGG GCGGTGGGCC GCGGGGGTGT CCTTGGATGA TGCATCTAAA 
8581 AGCGGTGACG CGGGCGGGCC CCCGGAGGTA GGGGGGGCTC GGGACCCGCC GGGAGAGGGG 
8641 GCAGGGGCAC GTCGGCGCCG CGCGCGGGCA GGAGCTGGTG CTGCGCGCGG AGGTTGCTGG 
8701 CGAACGCGAC GACGCGGCGG TTGATCTCCT GAATCTGGCG CCTCTGCGTG AAGACGACGG 
8761 GCCCGGTGAG CTTGAACCTG AAAGAGAGTT CGACAGAATC AATTTCGGTG TCGTTGACGG
8821 CGGCCTGGCG CAAAATCTCC TGCACGTCTC CTGAGTTGTC TTGATAGGCG ATCTCGGCCA 
8881 TGAACTGCTC GATCTCTTCC TCCTGGAGAT CTCCGCGTCC GGCTCGCTCC ACGGTGGCGG 
8941 CGAGGTCGTT GGAGATGCGG GCCATGAGCT GCGAGAAGGC GTTGAGGCCT CCCTCGTTCC 
9001 AGACGCGGCT GTAGACCACG CCCCCTTCGG CATCGCGGGC GCGCATGACC ACCTGCGCGA 
9061 GATTGAGCTC CACGTGCCGG GCGAAGACGG CGTAGTTTCG CAGGCGCTGA AAGAGGTAGT 
9121 TGAGGGTGGT GGCGGTGTGT TCTGCCACGA AGAAGTACAT AACCCAGCGC CGCAACGTGG 
9181 ATTCGTTGAT ATCCCCCAAG GCCTCAAGGC GCTCCATGGC CTCGTAGAAG TCCACGGCGA 
9241 AGTTGAAAAA CTGGGAGTTG CGCGCCGACA CGGTTAACTC CTCCTCCAGA AGACGGATGA 
9301 GCTCGGCGAC AGTGTCGCGC ACCTCGCGCT CAAAGGCTAC AGGGGCCTCT TCTTCTTCTT 
9361 CAATCTCCTC TTCCATAAGG GCCTCCCCTT CTTCTTCTTC TGGCGGCGGT GGGGGAGGGG 
9421 GGACACGGCG GCGACGACGG CGCACCGGGA GGCGGTCGAC AAAGCGCTCG ATCATCTCCC 
9481 CGCGGCGACG GCGCATGGTC TCGGTGACGG CGCGGCCGTT CTCGCGGGGG CGCAGTTGGA 
9541 AGACGCCGCC CGTCATGTCC CGGTTATGGG TTGGCGGGGG GCTGCCGTGC GGCAGGGATA 
9601 CGGCGCTAAC GATGCATCTC AACAATTGTT GTGTAGGTAC TCCGCCACCG AGGGACCTGA 
9661 GCGAGTCCGC ATCGACCGGA TCGGAAAACC TCTCGAGAAA GGCGTCTAAC CAGTCACAGT 
9721 CGCAAGGTAG GCTGAGCACC GTGGCGGGCG GCAGCGGGCG GCGGTCGGGG TTGTTTCTGG 
9781 CGGAGGTGCT GCTGATGATG TAATTAAAGT AGGCGGTCTT GAGACGGCGG ATGGTCGACA 
9841 GAAGCACCAT GTCCTTGGGT CCGGCCTGCT GAATGCGCAG GCGGTCGGCC ATGCCCCAGG 
9901 CTTCGTTTTG ACATCGGCGC AGGTCTTTGT AGTAGTCTTG CATGAGCCTT TCTACCGGCA 
9961 CTTCTTCTTC TCCTTCCTCT TGTCCTGCAT CTCTTGCATC TATCGCTGCG GCGGCGGCGG 
10021 AGTTTGGCCG TAGGTGGCGC CCTCTTCCTC CCATGCGTGT GACCCCGAAG CCCCTCATCG
10081 GCTGAAGCAG GGCCAGGTCG GCGACAACGC GCTCGGCTAA TATGGCCTGC TGCACCTGCG

10141 tIa^taga ct^aagtcg tkca^ca CAAAGCGGTG GTATGCGCCC otgttga^
10201 TGTAAGTGCA GTTGGCCATA ACGGACCAGT TAACGGTCTG GTGACCCGGC TGCGAGAGCT

10261 CGGTGTACCT GAGACGCGAG TAAGCCCTTG AGTCAAAGAC GTAGTCGTTG CAAGTCCGCA

10321 ccIStact* GTATCCCACC AAAAAGTGCG GCGGCGGCTG GCGGTAGAGG ogccagcgta
10381 GGGTGGCCGG GGCTCCGGGG GCGAGGTCTT CCAACATAAG GCGATGATAT ccgtaga^

10441 ACCTGGACAT CCAGGTGATG CCGGCGGCGG TGGTGGAGGC GCGCGGAAAG ^CGGACGC

10501 GGTTCCAGAT GTTGCGCAGC GGCAAAAAGT GCTCCATGGT CGGGACGCTC ^ccggtca

10561 GGCGCGCGCA GTCGTTGACG CTCTAGACCG TGCAAAAGGA GAGCCTGTAA GCGGGCACTC

10621 TTCCGTGGTC TGGTGGATAA ATTCGCAAGG GTATCATGGC GGACGACCGG GGTTCGAACC

10681 C^GA^CGG CCGTCCGCCG TGATCCATGC GGTTACCGCC CGCGTGTCGA ACCCAGG^
10741 GCGACGTCAG ACAACGGGGG AGCGCTCCTT TTGGCTTCCT TCCAGGCGCG OCOGATGCTG 
10801 CGCTAGCTTT TTTGGCCACT GGCCGCGCGC GGCGTAAGCG GTTAGGCTGG AAAGCGAAAG
10861 caSaLgg CTCGCTCCCT GTAGCCGGAG GGTTATTTTC CAAGGGTTGA GTCGCGGGAC

10921 CCCCGGTTCG AGTCTCGGGC CGGCCGGACT GCGGCGAACG ^TTTGCC TCCCCGTCAT
10981 GCAAGACCCC GCTTGCAAAT TCCTCCGGAA ACAGGGACGA GCCCCTTTTT TGCTTTTCCC 
11041 AGATGCATCC GGTGCTGCGG CAGATGCGCC CCCCTCCTCA GCAGCGGCAA GAGCAAGAGC 
11101 AGCGGCAGAC ATGCAGGGCA CCCTCCCCTT CTGCTACCGC GTCAGGAGGG GCAACATCCG 
11161 CGGCTGACGC GGCGGCAGAT GGTGATTACG AACCCCCGCG GCGCCGGACC CGGCACTACT 
11221 TGGACTTGGA GGAGGGCGAG GGCCTGGCGC GGCTAGGAGC OCCCTCTCCT ^CACC 
11281 CAAGGGTGCA GCTGAAGCGT GACACGCGCG AGGCGTACGT GCCGCGGCAG AACCTGTTTC 
11341 GCGACCGCGA GGGAGAGGAG CCCGAGGAGA TGCGGGATCG AAAGTTCCAT ^GGGCGCG
11401 SxTGCGGCA TGGCCTGAAC CGCGAGCGGT TGCTGCGCGA GGAGGACTTT GAGCCCGACG
11461 CGCGGACCGG GATTAGTCCC GCGCGCGCAC ACGTGGCGGC CGCCGACCTG ™CCGCGT
11521 ACGAGCAGAC GGTGAACCAG GAGATTAACT TTCAAAAAAG CTTTAACAAC CACGTGCGCA
11581 CGCTTGTGGC GCGCGAGGAG GTGGCTATAG GACTGATGCA TCTGTGGGAC
11641 CGCTGGAGCA AAACCCAAAT AGCAAGCCGC TCATGGCGCA GCTGTTCCTT ATAGTGCAGC 
11701 ACAGCAGGGA CAACGAGGCA TTCAGGGATG CGCTGCTAAA CATAGTAGAG CCCGAGGGCC 
11761 GCTGGCTGCT CGATTTGATA AACATTCTGC AGAGCATAGT GGTGCAGGAG CGCAGCTTGA 
11821 GCCTGGCTGA CAAGGTGGCC GCCATTAACT ATTCCATGCT CAGTCTGGGC AAGTTTTACG 
11881 CCCGCAAGAT ATACCATACC CCTTACGTTC CCATAGACAA GGAGGTAAAG ATGGAGGGGT
11941 TCTACATGCG CATGGCGCTG AAGGTGCTTA CCTTGAGCGA CGACCTGGGC ^TCGCA 
12001 ACGAGCGCAT CCACAAGGCC GTGAGCGTGA GCCGGCGGCG CGAGCTCAGC GACCGCGAGC 
12061 TGATGCACAG CCTGCAAAGG GCCC^GCTG GCACGGGCAG CGGCGATAGA GAGGCCGAGT
12121 CCTACTTTGA CGCGGGCGCT GACCTGCGCT GGGCCCCAAG CCGACGCGCC ^AGGCAG 
12181 CTGGGGCCGG ACCTGGGCTG GCGGTGGCAC CCGCGCGCGC TGGCAACGTC GGCGGCGTGG 
12241 AGGAATATGA CGAGGACGAT GAGTACGAGC CAGAGGACGG CGAGTACTAA GCGGTGATGT 
12301 TTCTGATCAG ATGATGCAAG ACGCAACGGA CCCGGCGGTG CGGGCGGCGC TGCAGAGCCA 
12361 GCCGTCCGGC CTTAACTCCA CGGACGACTG GCGCCAGGTC ATGGACCGCA TCATGTCGCT 
12421 GACTGCGCGC AACCCTGACG CGTTCCGGCA GCAGCCGCAG GCCAACCGGC TCTCCGCAAT 
12481 TCTGGAAGCG GTGGTCCCGG CGCGCGCAAA CCCCACGCAC GAGAAGGTGC TGGCGATCGT 
12541 AAACGCGCTG GCCGAAAACA GGGCCATCCG GCCCGATGAG GCCGGCCTGG TCTACGACGC
12601 GCTGCTTCAG CGCGTGGCTC GTTACAACAG CAGCAACGTG CAGACCAACG TGGACCGGCT
12661 GGTGGGGGAT GTGCGCGAGG CGGTGGCGCA GCGTGAGCGC GCGCAGCAGC AGGGCAACCT
12721 GGGCTCCATG GTTGCACTAA ACGCCTTCCT GAGTACACAG CCCGCCAACG TGCCGCGGGG 
12781 ACAGGAGGAC TACACCAACT TTGTGAGCGC ACTGCGGCTA ATGGTGACTG AGACACCGCA
12841 AAGTGAGGTG TATCAGTCCG GGCCAGACTA TTTTTTCCAG ACCAGTAGAC AAGGCCTGCA
12901 GACCGTAAAC CTGAGCCAGG CTTTCAAGAA CTTGCAGGGG CTGTGGGGGG TGCGGGCTCC 
12961 CACAGGCGAC CGCGCGACCG TGTCTAGCTT GCTGACGCCC AACTCGCGCC TGTTGCTGCT 
13021 GCTAATAGCG CCCTTCACGG ACAGTGGCAG CGTGTCCCGG GACACATACC TAGGTCACTT 
13081 GCTGACACTG TACCGCGAGG CCATAGGTCA GGCGCATGTG GACGAGCATA CTTTCCAGGA 
13141 GATTACAAGT GTTAGCCGCG CGCTGGGGCA GGAGGACACG GGCAGCCTGG AGGCAACCCT
13201 GAACTACCTG CTGACCAACC GGCGGCAAAA AATCCCCTCG TTGCACAGTT TAAACAGCGA 
13261 GGAGGAGCGC ATTTTGCGCT ATGTGCAGCA GAGCGTGAGC CTTAACCTGA TGCGCGACGG
13321 GGTAACGCCC AGCGTGGCGC TGGACATGAC CGCGCGCAAC ATGGAACCGG GCATGTATGC 
13381 CTCAAACCGG CCGTTTATCA ATCGCCTAAT GGACTACTTG CATCGCGCGG CCGCCGTGAA 
13441 CCCCGAGTAT TTCACCAATG CCATCTTGAA CCCGCACTGG CTACCGCCCC CTGGTTTCTA 
13501 CACCGGGGGA TTCGAGGTGC CCGAGGGTAA CGATGGATTC CTCTGGGACG ACATAGACGA 
13561 CAGCGTGTTT TCCCCGCAAC CGCAGACCCT GCTAGAGTTG CAACAACGCG AGCAGGCAGA 
13621 GGCGGCGCTG GGAAAGGAAA GCTTCCGCAG GCCAAGCAGC TTGTCCGATC TAGGCGCTGC 
13681 GGCCCCGCGG TCAGATGCTA GTAGCCCATT TCCAAGCTTG ATAGGGTCTC TTACCAGCAC 
13741 TCGCACCACC CGCCCGCGCC TGCTGGGCGA GGAGGAGTAC CTAAACAACT CGCTGCTGCA
13801 GCCGCAGCGC GAAAAGAACC TGCCTCCGGC GTTTCCCAAC AACGGGATAG AGAGCCTAGT 
13861 GGACAAGATG AGTAGATGGA AGACGTATGC GCAGGAGCAC AGGGATGTGC CCGGCCCGCG
13921 CCCGCCCACC CGTCGTCAAA GGCACGACCG TCAGCGGGGT CTGGTGTGGG AGGACGATGA 
13981 CTCGGCAGAC GACAGCAGCG TCTTGGATTT GGGAGGGAGT GGCAACCCGT TTGCACACCT 
14041 TCGCCCCAGG CTGGGGAGAA TGTTTTAAAA AAAGCATGAT GCAAAATAAA AAACTCACCA 
14101 AGGCCATGGC ACCGAGCGTT GGTTTTCTTG TATTCCCCTT AGTATGCGGC GCGCGGCGAT 
14161 GTATGAGGAA GGTCCTCCTC CCTCCTACGA GAGCGTGGTG AGCGCGGCGC CAGTGGCGGC 
14221 GGCGCTGGGT TCACCCTTCG ATGCTCCCCT GGACCCGCCG TTCGTGCCTC CGCGGTACCT 
14281 GCGGCCTACC GGGGGGAGAA ACAGCATCCG TTACTCTGAG TTGGCACCCC TATTCGACAC 
14341 CACCCGTGTG TACCTTGTGG ACAACAAGTC AACGGATGTG GCATCCCTGA ACTACCAGAA 
14401 CGACCACAGC AACTTTCTAA CCACGGTCAT TCAAAACAAT GACTACAGCC CGGGGGAGGC 
14461 AAGCACACAG ACCATCAATC TTGACGACCG GTCGCACTGG GGCGGCGACC TGAAAACCAT 
14521 CCTGCATACC AACATGCCAA ATGTGAACGA GTTCATGTTT ACCAATAAGT TTAAGGCGCG 
14581 GGTGATGGTG TCGCGCTCGC TTACTAAGGA CAAACAGGTG GAGCTGAAAT ACGAGTGGGT 

14641 GGAGTTCACG CTGCCCGAGG GCAACTACTC CGAGACCATG ACCATAGACC TTATGAACAA
14701 CGCGATCGTG GAGCACTACT TGAAAGTGGG CAGGCAGAAC GGGGTTCTGG AAAGCGACAT
14761 CGGGGTAAAG TTTGACACCC GCAACTTCAG ACTGGGGTTT GACCCAGTCA CTGGTCTTGT 

14821 CATGCCTGGG GTATATACAA ACGAAGCCTT CCATCCAGAC ATCATTTTGC TGCCAGGATG
14881 CGGGGTGGAC TTCACCCACA GCCGCCTGAG CAACTTGTTG GGCATCCGCA AGCGGCAACC
14941 CTTCCAGGAG GGCTTTAGGA TCACCTACGA TGACCTGGAG GGTGGTAACA TTCCCGCACT
15001 GTTGGATGTG GACGCCTACC AGGCAAGCTT GAAAGATGAC ACCGAACAGG GCGGGGGTGG 
15061 CGCAGGCGGC GGCAACAACA GTGGCAGCGG CGCGGAAGAG AACTCCAACG CGGCAGCTGC
15121 GGCAATGCAG CCGGTGGAGG ACATGAACGA TCATGCCATT CGCGGCGACA CCTTTGCCAC
15181 ACGGGCGGAG GAGAAGCGCG CTGAGGCCGA GGCAGCGGCC GAAGCTGCCG CCCCCGCTGC 
15241 GGAGGCTGCA CAACCCGAGG TCGAGAAGCC TCAGAAGAAA CCGGTGATTA AACCCCTGAC 
15301 AGAGGACAGC AAGAAACGCA GTTACAACCT AATAAGCAAT GACAGCACCT TCACCCAGTA 
15361 CCGCAGCTGG TACCTTGCAT ACAACTACGG CGACCCTCAG GCCGGGATCC GCTCATGGAC 
15421 CCTGCTTTGC ACTCCTGACG TAACCTGCGG CTCGGAGCAG GTATACTGGT CGTTGCCCGA 
15481 CATGATGCAA GACCCCGTGA CCTTCCGCTC CACGCGCCAG ATCAGCAACT TTCCGGTGGT 
15541 GGGCGCCGAG CTGTTGCCCG TGCACTCCAA GAGCTTCTAC AACGACCAGG CCGTCTACTC 
15601 CCAGCTCATC CGCCAGTTTA CCTCTCTGAC CCACGTGTTC AATCGCTTTC CCGAGAACCA 
15661 GATTTTGGCG CGCCCGCCAG CCCCCACCAT CACCACCGTC AGTGAAAACG TTCCTGCTCT 
15721 CACAGATCAC GGGACGCTAC CGCTGCGCAA CAGCATCGGA GGAGTCCAGC GAGTGACCAT 
15781 TACTGACGCC AGACGCCGCA CCTGCCCCTA CGTTTACAAG GCCCTGGGCA TAGTCTCGCC 
15841 GCGCGTCCTA TCGAGCCGCA CTTTTTGAGC AAGCATGTCC ATCCTTATAT CGCCCAGCAA 
15901 TAACACAGGC TGGGGCCTGC GCTTCCCAAG CAAGATGTTT GGCGGGGCCA AGAAGCGCTC 
15961 CGACCAACAC CCAGTGCGCG TGCGCGGGCA CTACCGCGCG CCCTGGGGCG CGCACAAACG 
16021 CGGCCGCACT GGGCGCACCA CCGTCGATGA CGCCATCGAC GCGGTGGTGG AGGAGGCGCG 
16081 CAACTACACG CCCACGCCGC CGCCAGTGTC CACCGTGGAC GCGGCCATTC AGACCGTGGT 
16141 GCGCGGAGCC CGGCGCTACG CTAAAATGAA GAGACGGCGG AGGCGCGTAG CACGTCGCCA 
16201 CCGCCGCCGA CCCGGCACTG CCGCCCAACG CGCGGCGGCG GCCCTGCTTA ACCGCGCACG 
16261 TCGCACCGGC CGACGGGCGG CCATGCGAGC CGCTCGAAGG CTGGCCGCGG GTATTGTCAC 
16321 TGTGCCCCCC AGGTCCAGGC GACGAGCGGC CGCCGCAGCA GCCGCGGCCA TTAGTGCTAT 
16381 GACTCAGGGT CGCAGGGGCA ACGTGTACTG GGTGCGCGAC TCGGTTAGCG GCCTGCGCGT 
16441 GCCCGTGCGC ACCCGCCCCC CGCGCAACTA GATTGCAATA AAAAACTACT TAGACTCGTA 
16501 CTGTTGTATG TATCCAGCGG CGGCGGCGCG CATCGAAGCT ATGTCCAAGC GCAAAATCAA 
16561 AGAAGAGATG CTCCAGGTCA TCGCGCCGGA GATCTATGGC CCCCCGAAGA AGGAAGAGCA 
16621 GGATTACAAG CCCCGAAAGC TAAAGCGGGT CAAAAAGAAA AAGAAAGATG ATGATGATGA 
16681 TGAACTTGAC GACGAGGTGG AACTGTTGCA CGCGACCGCG CCCAGGCGAC GGGTACAGTG
16741 GAAAGGTCGA CGCGTAAGAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 
16801 TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 
16861 GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 
16921 GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TGACACTGCA 
16981 GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 
17041 TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGT CAGCGACTGG AAGATGTCTT 
17101 GGAAAAAATG ACCGTGGAGC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 
17161 GGTGGCACCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACCA CCAGTAGCAC 
17221 TAGTATTGCC ACTGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCGGCGGT 
17281 GGCAGATGCC GCGGTGCAGG CGGCCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA 
17341 AACGGACCCG TGGATGTTTC GTGTTTCAGC CCCCCGGCGT CCGCGCCGTT CAAGGAAGTA 
17401 CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATCG CGCCTACCCC 
17461 CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 
17521 CACTGGAACC CGGCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG
17641 CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGATATG GCCCTCACCT GCCGCCTCCG
17701 TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG 
17761 CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT 
17821 GCGCGGCGGT ATCCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC 
17881 CGGAATTGCA TCCGTGGCCT TGCAGGCGCA GAGACACTGA TTAAAAACAA GTTACATGTG 
17941 GAAAAATCAA AATAAAAGTC TGGACTCTCA CGCTCGCTTG GTCCTGTAAC TATTTTGTAG 
18001 AATGGAAGAC ATCAACTTTG CGTCACTGGC CCCGCGACAC GGCTCGCGCC CGTTCATGGG 
18061 AAACTGGCAA GATATCGGCA CCAGCAATAT GAGCGGTGGC GCCTTCAGCT GGGGCTCGCT 
18121 GTGGAGCGGC ATTAAAAATT TCGGTTCCGC CGTTAAGAAC TATGGCAGCA AAGCCTGGAA 
18181 CAGCAGCACA GGCCAGATGC TGAGGGACAA GTTGAAAGAG CAAAATTTCC AACAAAAGGT 
18241 GGTAGATGGC CTGGCCTCTG GCATTAGCGG GGTGGTGGAC CTGGCCAACC AGGCAGTGCA
18301 AAATAAGATT AACAGTAAGC TTGATCCCCG CCCTCCCGTA GAGGAGCCTC CACCGGCCGT 
18361 GGAGACAGTG TCTCCAGAGG GGCGTGGCGA AAAGCGTCCG CGACCCGACA GGGAAGAAAC
18421 TCTGGTGACG CAAATAGACG AGCCTCCCTC GTACGAGGAG GCACTAAAGC AAGGCCTGCC 
18481 CACCACCCGT CCCATCGCGC CCATGGCTAC CGGAGTGCTG GGCCAGCACA CACCCGTAAC 

18541 GCTGGACCTG CCTCCCCCCG CCGACACCCA GCAGAAACCT GTGCTGCCAG GCCCGTCCGC
18601 CGTTGTTGTA ACCCGTCCTA GCCGCGCGTC CCTGCGCCGC GCCGCCAGCG GTCCGCGATC
18661 GTTGCGGCCC GTAGCCAGTG GCAACTGGCA AAGCACACTG AACAGCATCG TGGGTTTGGG
18721 GGTGCAATCC CTGAAGCGCC GACGATGCTT CTGATAGCTA ACGTGTCGTA TGTGTGTCAT
18781 GTATGCGTCC ATGTCGCCGC CAGAGGAGCT GCTGAGCCGC CGCGCGCCCG CTTTCCAAGA 
18841 TGGCTACCCC TTCGATGATG CCGCAGTGGT CTTACATGCA CATCTCGGGC CAGGACGCCT 
18901 CGGAGTACCT GAGCCCCGGG CTGGTGCAGT TCGCCCGCGC CACCGAGACG TACTTCAGCC 

18961 TGAATAACAA GTTTAGAAAC CCCACGGTGG CGCCTACGCA CGACGTGACC ACAGACCGGT

19021 CTCAGCGTTT GACGCTGCGG TTCATCCCCG TGGACCGCGA GGATACTGCG TACTCGTACA 
19081 AGGCGCGGTT CACCCTAGCT GTGGGTGATA ACCGTGTGCT AGACATGGCT TCCACGTACT 
19141 TTGACATCCG CGGCGTGGTG GACAGGGGCC CTACTTTTAA GCCCTACTCT GGCACTGCCT 
19201 ACAACGCACT GGCCCCCAAG GGTGCCCCCA ACTCGTGCGA GTGGGAACAA AATGAAACTG
19261 CACAAGTGGA TGCTCAAGAA CTTGACGAAG AGGAGAATGA AGCCAATGAA GCTCAGGCGC 
19321 GAGAACAGGA ACAAGCTAAG AAAACCCATG TATATGCCCA GGCTGCACTG TCCGGAATAA 
19381 AAATAACTAA AGAAGGTCTA CAAATAGGAA CTGCCGACGC CACAGTAGCA GGTGCCGGCA 
19441 AAGAAATTTT CGCAGACAAA ACTTTTCAAC CTGAACCACA AGTAGGAGAA TCTCAATGGA 
19501 ACGAAGCGGA TGCCACAGCA GCTGGTGGAA GGGTTCTTAA AAAGACAACT CCCATGAAAC
19561 CCTGCTATGG CTCATACGCT AGACCCACCA ATTCCAACGG CGGACAGGGC GTTATGGTTG 
19621 AACAAAATGG TAAATTGGAA AGTCAAGTCG AAATGCAATT TTTTTCCACA TGCACAAATG 
19681 CCACAAATGA AGTTAACAAT ATACAACCAA CAGTTGTATT GTACAGCGAA GATGTAAACA 
19741 TGGAAACTCC AGATACTCAT CTTTCTTATA AACCTAAAAT GGGGGATAAA AATGCCAAAG 
19801 TCATGCTTGG ACAACAAGCA ATGCCAAACA GACCAAATTA CATTGCTTTT AGAGACAATT 
19861 TTATTGGTCT CATGTATTAC AACAGCACAG GTAACATGGG TGTCCTTGCT GGTCAGGCAT
19921 CGCAGTTGAA CGCTGTTGTA GATTTGCAAG ACAGAAACAC AGAGCTGTCC TACCAGCTTT 
19981 TGCTTGATTC AATTGGCGAC AGAACAAGAT ACTTTTCAAT GTGGAATCAA GCTGTTGACA 
20041 GCTATGATCC AGATGTCAGA ATTATTGAGA ACCATGGAAC TGAGGATGAG TTGCCAAATT 
20101 ATTGCTTTCC TCTTGGTGGA ATTGGGATTA CTGACACTTT TCAAGCTGTT AAAACAACTG
20161 CTGCTAACGG GGACCAAGGC AATACTACCT GGCAAAAAGA TTCAACATTT GCAGAACGCA
20221 ATGAAATAGG GGTGGGAAAT AACTTTGCCA TGGAAATTAA CCTGAATGCC AACCTATGGA 
20281 GAAATTTCCT TTACTCCAAT ATTGCGCTGT ACCTGCCAGA CAAGCTAAAA TACAACCCCA 
20341 CCAATGTGGA AATATCTGAC AACCCCAACA CCTACGACTA CATGAACAAG CGAGTGGTGG 
20401 CTCCTGGGCT TGTAGACTGC TACATTAACC TTGGGGCGCG CTGGTCTCTG GACTACATGG 
20461 ACAACGTTAA TCCCTTTAAC CACCACCGCA ATGCGGGCCT GCGTTACCGC TCCATGTTGT 
20521 TGGGAAACGG CCGCTACGTG CCCTTTCACA TTCAGGTGCC CCAAAAGTTT TTTGCCATTA 
20581 AAAACCTCCT CCTCCTGCCA GGCTCATACA CATATGAATG GAACTTCAGG AAGGATGTTA 
20641 ACATGGTTCT GCAGAGCTCT CTGGGAAACG ACCTTAGAGT TGACGGGGCT AGCATTAAGT 
20701 TTGACAGCAT TTGTCTTTAC GCCACCTTCT TCCCCATGGC CCACAACACG GCCTCCACGC 
20761 TGGAAGCCAT GCTCAGAAAT GACACCAACG ACCAGTCCTT TAATGACTAC CTTTCCGCCG 
20821 CCAACATGCT ATATCCCATA CCCGCCAACG CCACCAACGT GCCCATCTCC ATCCCATCGC 
20881 GCAACTGGGC AGCATTTCGC GGTTGGGCCT TCACACGCTT GAAGACAAAG GAAACCCCTT 
20941 CCCTGGGATC AGGCTACGAC CCTTACTACA CCTACTCTGG CTCCATACCA TACCTTGACG 
21001 GAACCTTCTA TCTTAATCAC ACCTTTAAGA AGGTGGCCAT TACTTTTGAC TCTTCTGTTA 
21061 GCTGGCCGGG CAACGACCGC CTGCTTACTC CCAATGAGTT TGAGATTAAG CGCTCAGTTG 
21121 ACGGGGAGGG CTATAACGTA GCTCAGTGCA ACATGACAAA GGACTGGTTC CTAGTGCAGA 
21181 TGTTGGCCAA CTACAATATT GGCTACCAGG GCTTCTACAT TCCAGAAAGC TACAAAGACC 
21241 GCATGTACTC GTTCTTCAGA AACTTCCAGC CCATGAGCCG GCAAGTGGTG GACGATACTA 
21301 AATACAAAGA TTATCAGCAG GTTGGAATTA TCCACCAGCA TAACAACTCA GGCTTCGTAG 
21361 GCTACCTCGC TCCCACCATG CGCGAGGGAC AAGCTTACCC CGCTAATGTT CCCTACCCAC 
21421 TAATAGGGAA AACCGCGGTT GATAGTATTA CCCAGAAAAA GTTTCTTTGC GACCGCACCC 
21481 TGTGGCGCAT CCCCTTCTCC AGTAACTTTA TGTCCATGGG TGCGCTCACA GACCTGGGCC 
21541 AAAACCTTCT CTACGCAAAC TCCGCCCACG CGCTAGACAT GACCTTTGAG GTGGATCCCA 
21601 TGGACGAGCC CACCCTTCTT TATGTTTTGT TTGAAGTCTT TGACGTGGTC CGTGTGCACC 
21661 AGCCGCACCG CGGCGTCATC GAGACCGTGT ACCTGCGCAC GCCCTTCTCG GCCGGCAACG 
21721 CCACAACATA AAGAAGCAAG CAACATCAAC AACAGCTGCC GCCATGGGCT CCAGTGAGCA
21781 GGAACTGAAA GCCATTGTCA AAGATCTTGG TTGTGGGCCA TATTTTTTGG GCACCTATGA 
21841 CAAGCGCTTC CCAGGCTTTG TTTCCCCACA CAAGCTCGCC TGCGCCATAG TTAACACGGC 
21901 CGGTCGCGAG ACTGGGGGCG TACACTGGAT GGCCTTTGCC TGGAACCCGC GCTCAAAAAC 
21961 ATGCTACCTC TTTGAGCCCT TTGGCTTTTC TGACCAACGT CTCAAGCAGG TTTACCAGTT 
22021 TGAGTACGAG TCACTCCTGC GCCGTAGCGC CATTGCCTCT TCCCCCGACC GCTGTATAAC 
22081 GCTGGAAAAG TCCACCCAAA GCGTGCAGGG GCCCAACTCG GCCGCCTGTG GCCTATTCTG 
22141 CTGCATGTTT CTCCACGCCT TTGCCAACTG GCCCCAAACT CCCATGGATC ACAACCCCAC 
22201 CATGAACCTT ATTACCGGGG TACCCAACTC CATGCTTAAC AGTCCCCAGG TACAGCCCAC 
22261 CCTGCGCCGC AACCAGGAAC AGCTCTACAG CTTCCTGGAG CGCCACTCGC CCTACTTCCG 
22321 CAGCCACAGT GCGCAAATTA GGAGCGCCAC TTCTTTTTGT CACTTGAAAA ACATGTAAAA 
22381 ATAATGTACT AGGAGACACT TTCAATAAAG GCAAATGTTT TTATTTGTAC ACTCTCGGGT 
22441 GATTATTTAC CCCCACCCTT GCCGTCTGCG CCGTTTAAAA ATCAAAGGGG TTCTGCCGCG 
22501 CATCGCTATG CGCCACTCGC AGGGACACGT TGCGATACTG GTGTTTAGTG CTCCACTTAA 
22561 ACTCAGGCAC AACCATCCGC GGCAGCTCGG TGAAGTTTTC ACTCCACAGG CTGCGCACCA 
22621 TCACCAACGC GTTTAGCAGG TCGGGCGCCG ATATCTTGAA GTCGCAGTTG GGGCCTCCGC
22681 CCTGCGCGCG CGAGTTGCGA TACACAGGGT TACAGCACTG GAACACTATC AGCGCCGGGT
22741 GGTGCACGCT GGCCAGCACG CTCTTGTCGG AGATCAGATC CGCGTCCAGG TCCTCCGCGT 
22801 TGCTCAGGGC GAACGGAGTC AACTTTGGTA GCTGCCTTCC CAAAAAGGGT GCATGCCCAG 
22861 GCTTTGAGTT GCACTCGCAC CGTAGTGGCA TCAGAAGGTG ACCGTGCCCA GTCTGGGCGT 
22921 TAGGATACAG CGCCTGCATG AAAGCCTTGA TCTGCTTAAA AGCCACCTGA GCCTTTGCGC 
22981 CTTCAGAGAA GAACATGCCG CAAGACTTGC CGGAAAACTG ATTGGCCGGA CAGGCCGCGT
23041 CATGCACGCA GCACCTTGCG TCGGTGTTGG AGATCTGCAC CACATTTCGG CCCCACCGGT
23101 TCTTCACGAT CTTGGCCTTG CTAGACTGCT CCTTCAGCGC GCGCTGCCCG TTTTCGCTCG
23161 TGACATCCAT TTCAATCACG TGCTCCTTAT TTATCATAAT GCTCCCGTGT AGACACTTAA 
23221 GCTCGCCTTC GATCTCAGCG CAGCGGTGCA GCCACAACGC GCAGCCCGTG GGCTCGTGGT 
23281 GCTTGTAGGT TACCTCTGCA AACGACTGCA GGTACGCCTG CAGGAATCGC CCCATCATCG 
23341 TCACAAAGGT CTTGTTGCTG GTGAAGGTCA GCTGCAACCC GCGGTGCTCC TCGTTTAGCC 
23401 AGGTCTTGCA TACGGCCGCC AGAGCTTCCA CTTGGTCAGG CAGTAGCTTG AAGTTTGCCT 
23461 TTAGATCGTT ATCCACGTGG TACTTGTCCA TCAACGCGCG CGCAGCCTCC ATGCCCTTCT 
23521 CCCACGCAGA CACGATCGGC AGGCTCAGCG GGTTTATCAC CGTGCTTTCA CTTTCCGCTT 
23581 CACTGGACTC TTCCTTTTCC TCTTGCATCC GCATACCCCG CGCCACTGGG TCGTCTTCAT 
23641 TCAGCCGCCG CACCGTGCGC TTACCTCCCT TGCCGTGCTT GATTAGCACC GGTGGGTTGC 
23701 TGAAACCCAC CATTTGTAGC GCCACATCTT CTCTTTCTTC CTCGCTGTCC ACGATCACCT
23761 CTGGGGATGG CGGGCGCTCG GGCTTGGGAG AGGGGCGCTT CTTTTTCTTT TTGGACGCAA 
23821 TGGCCAAATC CGCCGTCGAG GTCGATGGCC GCGGGCTGGG TGTGCGCGGC ACCAGCGCAT 
23881 CTTGTGACGA GTCTTCTTCG TCCTCGGACT CGAGACGCCG CCTCAGCCGC TTTTTTGGGG 
23941 GCGCGCGGGG AGGCGGCGGC GACGGCGACG GGGACGAGAC GTCCTCCATG GTTGGTGGAC 
24001 GTCGCGCCGC ACCGCGTCCG CGCTCGGGGG TGGTTTCGCG CTGCTCCTCT TCCCGACTGG 
24061 CCATTTCCTT CTCCTATAGG CAGAAAAAGA TCATGGAGTC AGTCGAGAAG GAGGACAGCC 
24121 TAACCGCCCC CTTTGAGTTC GCCACCACCG CCTCCACCGA TGCCGCCAAC GCGCCTACCA 
24181 CCTTCCCCGT CGAGGCACCC CCGCTTGAGG AGGAGGAAGT GATTATCGAG CAGGACCCAG
24241 GTTTTGTAAG CGAAGACGAC GAAGATCGCT CAGTACCAAC AGAGGATAAA AAGCAAGACC 
24301 AGGACGACGC AGAGGCAAAC GAGGAACAAG TCGGGCGGGG GGACCAAAGG CATGGCGACT 
24361 ACCTAGATGT GGGAGACGAC GTGCTGTTGA AGCATCTGCA GCGCCAGTGC GCCATTATCT 
24421 GCGACGCGTT GCAAGAGCGC AGCGATGTGC CCCTCGCCAT AGCGGATGTC AGCCTTGCCT 
24481 ACGAACGCCA CCTGTTCTCA CCGCGCGTAC CCCCCAAACG CCAAGAAAAC GGCACATGCG 
24541 AGCCCAACCC GCGCCTCAAC TTCTACCCCG TATTTGCCGT GCCAGAGGTG CTTGCCACCT 
24601 ATCACATCTT TTTCCAAAAC TGCAAGATAC CCCTATCCTG CCGTGCCAAC CGCAGCCGAG 
24661 CGGACAAGCA GCTGGCCTTG CGGCAGGGCG CTGTCATACC TGATATCGCC TCGCTCGACG 
24721 AAGTGCCAAA AATCTTTGAG GGTCTTGGAC GCGACGAGAA GCGCGCGGCA AACGCTCTGC 
24781 AACAAGAAAA CAGCGAAAAT GAAAGTCACT GTGGAGTGCT GGTGGAACTT GAGGGTGACA 
24841 ACGCGCGCCT AGCCGTGCTG AAACGCAGCA TCGAGGTCAC CCACTTTGCC TACCCGGCAC 
24901 TTAACCTACC CCCCAAGGTT ATGAGCACAG TCATGAGCGA GCTGATCGTG CGCCGTGCAC 
24961 GACCCCTGGA GAGGGATGCA AACTTGCAAG AACAAACCGA GGAGGGCCTA CCCGCAGTTG 
25021 GCGATGAGCA GCTGGCGCGC TGGCTTGAGA CGCGCGAGCC TGCCGACTTG GAGGAGCGAC 
25081 GCAAGCTAAT GATGGCCGCA GTGCTTGTTA CCGTGGAGCT TGAGTGCATG CAGCGGTTCT 
25141 TTGCTGACCC GGAGATGCAG CGCAAGCTAG AGGAAACGTT GCACTACACC TTTCGCCAGG
25201 GCTACGTGCG CCAGGCCTGC AAAATTTCCA ACGTGGAGCT CTGCAACCTG GTCTCCTACC
25261 TTGGAATTTT GCACGAAAAC CGCCTTGGGC AAAACGTGCT TCATTCCACG CTCAAGGGCG 
25321 AGGCGCGCCG CGACTACGTC CGCGACTGCG TTTACTTATT TCTGTGCTAC ACCTGGCAAA 
25381 CGGCCATGGG CGTGTGGCAG CAGTGCCTGG AGGAGCGCAA CCTGAAGGAG CTGCAGAAGC 
25441 TGCTAAAGCA AAACTTGAAG GACCTATGGA CGGCCTTCAA CGAGCGCTCC GTGGCCGCGC 
25501 ACCTGGCGGA CATTATCTTC CCCGAACGCC TGCTTAAAAC CCTGCAACAG GGTCTGCCAG 
25561 ACTTCACCAG TCAAAGCATG TTGCAAAACT TTAGGAACTT TATCCTAGAG CGTTCAGGAA 
25621 TTCTGCCCGC CACCTGCTGT GCGCTTCCTA GCGACTTTGT GCCCATTAAG TACCGTGAAT 
25681 GCCCTCCGCC GCTTTGGGGT CACTGCTACC TTCTGCAGCT AGCCAACTAC CTTGCCTACC 
25741 ACTCCGACAT CATGGAAGAC GTGAGCGGTG ACGGCCTACT GGAGTGTCAC TGTCGCTGCA 
25801 ACCTATGCAC CCCGCACCGC TCCCTGGTCT GCAATTCACA ACTGCTTAGC GAAAGTCAAA 
25861 TTATCGGTAC CTTTGAGCTG CAGGGTCCCT CGCCTGACGA AAAGTCCGCG GCTCCGGGGT 
25921 TGAAACTCAC TCCGGGGCTG TGGACGTCGG CTTACCTTCG CAAATTTGTA CCTGAGGACT 
25981 ACCACGCCCA CGAGATTAGG TTCTACGAAG ACCAATCCCG CCCGCCAAAT GCGGAGCTTA 
26041 CCGCCTGCGT CATTACCCAG GGCCACATCC TTGGCCAATT GCAAGCCATT AACAAAGCCC 
26101 GCCAAGAGTT TCTGCTACGA AAGGGACGGG GGGTTTACTT GGACCCCCAG TCCGGCGAGG 
26161 AGCTCAACCC AATCCCCCCG CCGCCGCAGC CCTATCAGCA GCCGCGGGCC CTTGCTTCCC 
26221 AGGATGGCAC CCAAAAAGAA GCTGCAGCTG CCGCCGCCGC CACCCACGGA CGAGGAGGAA 
26281 TACTGGGACA GTCAGGCAGA GGAGGTTTTG GACGAGGAGG AGGAGATGAT GGAAGACTGG
26341 GACAGCCTAG ACGAGGAAGC TTCCGAGGCC GAAGAGGTGT CAGACGAAAC ACCGTCACCC 
26401 TCGGTCGCAT TCCCCTCGCC GGCGCCCCAG AAATCGGCAA CCGTTCCCAG CATTGCTACA
26461 ACCTCCGCTC CTCAGGCGCC GCCGGCACTG CCCGTTCGCC GACCCAACCG TAGATGGGAC
26521 ACCACTGGAA CCAGGGCCGG TAAGTCTAAG CAGCCGCCGC CGTTAGCCCA AGAGCAACAA
26581 CAGCGCCAAG GCTACCGCTC GTGGCGCGTG CACAAGAACG CCATAGTTGC TTGCTTGCAA
26641 GACTGTGGGG GCAACATCTC CTTCGCCCGC CGCTTTCTTC TCTACCATCA CGGCGTGGCC 
26701 TTCCCCCGTA ACATCCTGCA TTACTACCGT CATCTCTACA GCCCCTACTG CACCGGCGGC
26761 AGCGGCAGCA ACAGCAGCGG CCACGCAGAA GCAAAGGCGA CCGGATAGCA AGACTCTGAC 
26821 AAAGCCCAAG AAATCCACAG CGGCGGCAGC AGCAGGAGGA GGAGCACTGC GTCTGGCGCC 
26881 CAACGAACCC GTATCGACCC GCGAGCTTAG AAACAGGATT TTTCCCACTC TGTATGCTAT 
26941 ATTTCAACAG AGCAGGGGCC AAGAACAAGA GCTGAAAATA AAAAACAGGT CTCTGCGCTC
27001 CCTCACCCGC AGCTGCCTGT ATCACAAAAG CGAAGATCAG CTTCGGCGCA CGCTGGAAGA 
27061 CGCGGAGGCT CTCTTCAGCA AATACTGCGC GCTGACTCTT AAGGACTAGT TTCGCGCCCT 
27121 TTCTCAAATT TAAGCGCGAA AACTACGTCA TCTCCAGCGG CCACACCCGG CGCCAGCACC
27181 TGTCGTCAGC GCCATTATGA GCAAGGAAAT TCCCACGCCC TACATGTGGA GTTACCAGCC 
27241 ACAAATGGGA CTTGCGGCTG GAGCTGCCCA AGACTACTCA ACCCGAATAA ACTACATGAG 
27301 CGCGGGACCC CACATGATAT CCCGGGTCAA CGGAATCCGC GCCCACCGAA ACCGAATTCT
27361 CCTCGAACAG GCGGCTATTA CCACCACACC TCGTAATAAC CTTAATCCCC GTAGTTGGCC
27421 CGCTGCCCTG GTGTACCAGG AAAGTCCCGC TCCCACCACT GTGGTACTTC CCAGAGACGC 
27481 CCAGGCCGAA GTTCAGATGA CTAACTCAGG GGCGCAGCTT GCGGGCGGCT TTCGTCACAG 
27541 GGTGCGGTCG CCCGGGCAGG GTATAACTCA CCTGAAAATC AGAGGGCGAG GTATTCAGCT 
27601 CAACGACGAG TCGGTGAGCT CCTCTCTTGG TCTCCGTCCG GACGGGACAT TTCAGATCGG 
27661 CGGCGCTGGC CGCTCTTCAT TTACGCCCCG TCAGGCGATC CTAACTCTGC AGACCTCGTC
27721 CTCGGAGCCG CGCTCCGGAG GCATTGGAAC TCTACAATTT ATTGAGGAGT TCGTGCCTTC
27781 GGTTTACTTC AACCCCTTTT CTGGACCTCC CGGCCACTAC CCGGACCAGT TTATTCCCAA
27841 CTTTGACGCG GTAAAAGACT CGGCGGACGG CTACGACTGA ATGACCAGTG GAGAGGCAGA
27901 GCAACTGCGC CTGACACACC TCGACCACTG CCGCCGCCAC AAGTGCTTTG CCCGCGGCTC
27961 CGGTGAGTTT TGTTACTTTG AATTGCCCGA AGAGCATATC GAGGGCCCGG CGCACGGCGT 
28021 CCGGCTCACC ACCCAGGTAG AGCTTACACG TAGCCTGATT CGGGAGTTTA CCAAGCGCCC
28081 CCTGCTAGTG GAGCGGGAGC GGGGTCCCTG TGTTCTGACC GTGGTTTGCA ACTGTCCTAA
28141 CCCTGGATTA CATCAAGATC TTTGTTGTCA TCTCTGTGCT GAGTATAATA AATACAGAAA
28201 TTAGAATCTA CTGGGGCTCC TGTCGCCATC CTGTGAACGC CACCGTTTTT ACCCACCCAA 
28261 AGCAGACCAA AGCAAACCTC ACCTCCGGTT TGCACAAGCG GGCCAATAAG TACCTTACCT 
28321 GGTACTTTAA CGGCTCTTCA TTTGTAATTT ACAACAGTTT CCAGCGAGAC GAAGTAAGTT 
28381 TGCCACACAA CCTTCTCGGC TTCAACTACA CCGTCAAGAA AAACACCACC ACCACCCTCC 
28441 TCACCTGCCG GGAACGTACG AGTGCGTCAC CGGTTGCTGC GCCCACACCT ACAGCCTGAG
28501 CGTAACCAGA CATTACTCCC ATTTTCCCAA AACAGGAGGT GAGCTCAACT CCCGGAACTC
28561 AGGTCAAAAA AGCATTTTGC GGGGTGCTGG GATTTTTTAA TTAAGTATAT GAGCAATTCA 
28621 AGTAACTCTA CAAGCTTGTC TAATTTTTCT GGAATTGGGG TCGGGGTTAT CCTTACTCTT 
28681 GTAATTCTGT TTATTCTTAT ACTAGCACTT CTGTGCCTTA GGGTTGCCGC CTGCTGCApG 
28741 CACGTTTGTA CCTATTGTCA GCTTTTTAAA CGCTGGGGGC GACATCCAAG ATGAGGTACA 
28801 TGATTTTAGG CTTGCTCGCC CTTGCGGCAG TCTGCAGCGC TGCCAAAAAG GTTGAGTTTA 
28861 AGGAACCAGC TTGCAATGTT ACATTTAAAT CAGAAGCTAA TGAATGCACT ACTCTTATAA
28921 AATGCACCAC AGAACATGAA AAGCTTATTA TTCGCCACAA AGACAAAATT GGCAAGTATG
28981 CTGTATATGC TATTTGGCAG CCAGGTGACA CTAACGACTA TAATGTCACA GTCTTCCAAG 
29041 GTGAAAATCG TAAAACTTTT ATGTATAAAT TTCCATTTTA TGAAATGTGC GATATTACCA 
29101 TGTACATGAG CAAACAGTAC AAGTTGTGGC CCCCACAAAA GTGTTTAGAG AACACTGGCA 
29161 CCTTTTGTTC CACCGCTCTG CTTATTACAG CGCTTGCTTT GGTATGTACC TTACTTTATC 
29221 TCAAATACAA AAGCAGACGC AGTTTTATTG ATGAAAAGAA AATGCCTTGA TTTTCCGCTT 
29281 GCTTGTATTC CCCTGGACAA TTTACTCTAT GTGGGATATG CGCCAGGCGG GAAAGATTAT 
29341 ACCCACAACC TTCAAATCAA ACTTTCCTGG ACGTTAGCGC CTGACTTCTG CCAGCGCCTG 
29401 CACTGCAAAT TTGATCAAAC CCAGCTTCAG CTTGCCTGCT CCAGAGATGA CCGGCTCAAC
29461 CATCGCGCCC ACAACGGACT ATCGCAACAC CACTGCTACC GGACTAAAAT CTGCCCTAAA
29521 TTTACCCCAA GTTCATGCCT TTGTCAATGA CTGGGCGAGC TTGGGCATGT GGTGGTTTTC 
29581 CATAGCGCTT ATGTTTGTTT GCCTTATTAT TATGTGGCTT ATTTGTTGCC TAAAGCGCAG 
29641 ACGCGCCAGA CCCCCCATCT ATAGGCCTAT CATTGTGCTC AACCCACACA ATGAAAAAAT
29701 TCATAGATTG GACGGTCTCA AACCATGTTC TCTTCTTTTA CAGTATGATT AAATGAGACA 
29761 TGATTCCTCG AGTCCTTATA TTATTGACCC TTGTTGCGCT TTTCTGTGCG TGCTCTACAT 
29821 TGGCTGCGGT CGCTCACATC GAAGTAGATT GCATCCCACC TTTCACAGTT TACCTGCTTT 
29881 ACGGATTTGT CACCCTTATC CTCATCTGCA GCCTCGTCAC TGTAGTCATC GCCTTCATTC 
29941 AGTTCATTGA CTGGATTTGT GTGCGCATTG CGTACCTTAG GCACCATCCG CAATACAGAG 
30001 ACAGGACTAT AGCTGATCTT CTCAGAATTC TTTAATTATG AAACGGATTG TCACTTTTGT
30061 TTTGCTGATT TTCTGCGCCC TACCTGTGCT TTGCTCCCAA ACCTCAGCGC CTCCCAAAAG 
30121 ACATATTTCC TGCAGATTCA CTCAAATATG GAACATTCCC AGCTGCTACA ACAAACAGAG
30181 CGATTTGTCA GAAGCCTGGT TATACGCCAT CATCTCTGTC ATGGTTTTTT GCAGTACCAT
30241 TTTTGCCCTA GCCATATACC CATACCTTGA CATTGGTTGG AATGCCATAG ATGCCATGAA
30301 CCACCCTACT TTCCCAGCGC CCAATGTCAT ACCACTGCAA CAGGTTATTG CCCCAATCAA 
30361 TCAGCCTCGC CCCCCTTCTC CCACCCCCAC TGAGATTAGC TACTTTAATT TGACAGGTGG 
30421 AGATGACTGA ATCTCTAGAT CTAGAATTGG ATGGAATTAA CACCGAACAG CGCCTACTAG 
30481 AAAGGCGCAA GGCGGCGTCC GAGCGAGAAC GCCTAAAACA AGAAGTTGAA GACATGGTTA 
30541 ACCTGCACCA GTGTAAAAGA GGTATCTTTT GTGTGGTCAA GCAGGCCAAA CTTACCTACG 
30601 AAAAAACCAC TACCGGCAAC CGCCTTAGCT ACAAGCTACC CACCCAGCGC CAAAAACTGG 
30661 TGCTTATGGT GGGAGAAAAA CCTATCACCG TCACCCAGCA CTCGGCAGAA ACAGAAGGCT
30721 GCCTGCACTT CCCCTATCAG GGTCCAGAGG ACCTCTGCAC TCTTATTAAA ACCATGTGTG 
30781 GCATTAGAGA TCTTATTCCA TTCAACTAAC AATAAACACA CAATAAATTA CTTACTTAAA 
30841 ATCAGTCAGC AAATCTTTGT CCAGCTTATT CAGCATCACC TCCTTTCCCT CCTCCCAACT 
30901 CTGGTATTTC AGCAGCCTTT TAGCTGCGAA CTTTCTCCAA AGTCTAAATG GGATGTCAAA 
30961 TTCCTCATGT TCTTGTCCCT CCGCACCCAC TATCTTCATA TTGTTGCAGA TGAAACGCGC 
31021 CAGACCGTCT GAAGACACCT TCAACCCTGT GTACCCATAT GACACGGAAA CCGGCCCTCC 
31081 AACTGTGCCT TTCCTTACCC CTCCCTTTGT GTCGCCAAAT GGGTTCCAAG AAAGTCCCCC 
31141 CGGAGTGCTT TCTTTGCGTC TTTCAGAACC TTTGGTTACC TCACACGGCA TGCTTGCGCT 
31201 AAAAATGGGC AGCGGCCTGT CCCTGGATCA GGCAGGCAAC CTTACATCAA ATACAATCAC 
31261 TGTTTCTCAA CCGCTAAAAA AAACAAAGTC CAATATAACT TTGGAAACAT CCGCGCCCCT 
31321 TACAGTCAGC TCAGGCGCCC TAACCATGGC CACAACTTCG CCTTTGGTGG TCTCTGACAA 
31381 CACTCTTACC ATGCAATCAC AAGCACCGCT AACCGTGCAA GACTCAAAAC TTAGCATTGC 
31441 TACCAAAGAG CCACTTACAG TGTTAGATGG AAAACTGGCC CTGCAGACAT CAGCCCCCCT 
31501 CTCTGCCACT GATAACAACG CCCTCACTAT CACTGCCTCA CCTCCTCTTA CTACTGCAAA 
31561 TGGTAGTCTG GCTGTTACCA TGGAAAACCC ACTTTACAAC AACAATGGAA AACTTGGGCT 
31621 CAAAATTGGC GGTCCTTTGC AAGTGGCCAC CGACTCACAT GCACTAACAC TAGGTACTGG 
31681 TCAGGGGGTT GCAGTTCATA ACAATTTGCT ACATACAAAA GTTACAGGCG CAATAGGGTT 
31741 TGATACATCT GGCAACATGG AACTTAAAAC TGGAGATGGC CTCTATGTGG ATAGCGCCGG 
31801 TCCTAACCAA AAACTACATA TTAATCTAAA TACCACAAAA GGCCTTGCTT TTGACAACAC 
31861 CGCAATAACA ATTAACGCTG GAAAAGGGTT GGAATTTGAA ACAGACTCCT CAAACGGAAA 
31921 TCCCATAAAA ACAAAAATTG GATCAGGCAT ACAATATAAT ACCAATGGAG CTATGGTTGC 
31981 AAAACTTGGA ACAGGCCTCA GTTTTGACAG CTCCGGAGCC ATAACAATGG GCAGCATAAA 
32041 CAATGACAGA CTTACTCTTT GGACAACACC AGACCCATCC CCAAATTGCA GAATTGCTTC 
32101 AGATAAAGAC TGCAAGCTAA CTCTGGCGCT AACAAAATGT GGCAGTCAAA TTTTGGGCAC 
32161 TGTTTCAGCT TTGGCAGTAT CAGGTAATAT GGCCTCCATC AATGGAACTC TAAGCAGTGT 
32221 AAACTTGGTT CTTAGATTTG ATGACAACGG AGTGCTTATG TCAAATTCAT CACTGGACAA 
32281 ACAGTATTGG AACTTTAGAA ACGGGGACTC CACTAACGGT CAACCATACA CTTATGCTGT 
32341 TGGGTTTATG CCAAACCTAA AAGCTTACCC AAAAACTCAA AGTAAAACTG CAAAAAGTAA 
32401 TATTGTTAGC CAGGTGTATC TTAATGGTGA CAAGTCTAAA CCATTGCATT TTACTATTAC 
32461 GCTAAATGGA ACAGATGAAA CCAACCAAGT AAGCAAATAC TCAATATCAT TCAGTTGGTC 
32521 CTGGAACAGT GGACAATACA CTAATGACAA ATTTGCCACC AATTCCTATA CCTTCTCCTA 
32581 CATTGCCCAG GAATAAAGAA TCGTGAACCT GTTGCATGTT ATGTTTCAAC GTGTTTATTT 
32641 TTCAATTGCA GAAAATTTCA AGTCATTTTT CATTCAGTAG TATAGCCCCA CCACCACATA 
32701 GCTTATACTA ATCACCGTAC CTTAATCAAA CTCACAGAAC CCTAGTATTC AACCTGCCAC
32761 CTCCCTCCCA ACACACAGAG TACACAGTCC TTTCTCCCCG GCTGGCCTTA AACAGCATCA
32821 TATCATGGGT AACAGACATA TTCTTAGGTG TTATATTCCA CACGGTCTCC TGTCGAGCCA 
32881 AACGCTCATC AGTGATGTTA ATAAACTCCC CGGGCAGCTC GCTTAAGTTC ATGTCGCTGT
32941 CCAGCTGCTG AGCCACAGGC TGCTGTCCAA CTTGCGGTTG CTCAACGGGC GGCGAAGGAG 
33001 AAGTCCACGC CTACATGGGG GTAGAGTCAT AATCGTGCAT CAGGATAGGG CGGTGGTGCT 
33061 GCAGCAGCGC GCGAATAAAC TGCTGCCGCG GCCGCTCCGT CCTGCAGGAA TACAACATGG 
33121 CAGTGGTCTC CTCAGCGATG ATTCGCACCG CCCGCAGCAT AAGGCGCCTT GTCCTCCGGG
33181 CACAGCAGCG CACCCTGATC TCACTTAAGT CAGCACAGTA ACTGCAGCAC AGTACCACAA
33241 TATTGTTTAA AATCCCACAG TGCAAGGCGC TGTATCCAAA GCTCATGGCG GGGACCACAG 
33301 AACCCACGTG GCCATCATAC CACAAGCGCA GGTAGATTAA GTGGCGACCC CTCATAAACA 
33361 CGCTGGACAT AAACATTACC TCTTTTGGCA TGTTGTAATT CACCACCTCC CGGTACCATA 
33421 TAAACCTCTG ATTAAACATG GCGCCATCCA CCACCATCCT AAACCAGCTG GCCAAAACCT 
33481 GCCCGCCGGC TATGCACTGC AGGGAACCGG GACTGGAACA ATGACAGTGG AGAGCCCAGG 
33541 ACTCGTAACC ATGGATCATC ATGCTCGTCA TGATATCAAT GTTGGCACAA CACAGGCACA 
33601 CGTGCATACA CTTCCTCAGG ATTACAAGCT CCTCCCGCGT CAGAACCATA TCCCAGGGAA 
33661 CAACCCATTC CTGAATCAGC GTAAATCCCA CACTGCAGGG AAGACCTCGC ACGTAACTCA 
33721 CGTTGTGCAT TGTCAAAGTG TTACATTCGG GCAGCAGCGG ATGATCCTCC AGTATGGTAG 
33781 CGCGTGTCTC TGTCTGAAAA GGAGGTAGGC GATCCCTACT GTACGGAGTG CGCCGAGACA 
33841 ACCGAGATCG TGTTGGTCGT AGTGTCATGC CAAATGGAAC GCCGGACGTA GTCATATTTC 
33901 CTGAAGCAAA ACCAGGTGCG GGCGTGACAA ACAGATCTGC GTCTCCGGTC TCGTCGCTTA
33961 GCTCGCTCTG TGTAGTAGTT GTAGTATATC CACTCTCTCA AAGCATCCAG GCGCCCCCTG 
34021 GCTTCGGGTT CTATGTAAAC TCCTTCATGC GCCGCTGCCC TGATAACATC CACCACCGCA 
34081 GAATAAGCCA CACCCAGCCA ACCTACACAT TCGTTCTGCG AGTCACACAC GGGAGGAGCG 
34141 GGAAGAGCTG GAAGAACCAT GTTTTTTTTT TTTATTCCAA AAGATTATCC AAAACCTCAA 
34201 AATGAAGATC TATTAAGTGA ACGCGCTCCC CTCCGGTGGC GTGGTCAAAC TCTACAGCCA
34261 AAGAACAGAT AATGGCATTT GTAAGATGTT GCACAATGGC TTCCAAAAGG CAAACTGCCC 
34321 TCACGTCCAA GTGGACGTAA AGGCTAAACC CTTCAGGGTG AATCTCCTCT ATAAACATTC 
34381 CAGCACCTTC AACCATGCCC AAATAATTTT CATCTCGCCA CCTTATCAAT ATGTCTCTAA 
34441 GCAAATCCCG AATATTAAGT CCGGCCATTG TAAAAATCTG CTCCAGAGCG CCCTCCACCT 
34501 TCAGCCTCAA GCAGCGAATC ATGATTGCAA AAATTCAGGT TCCTCACAGA CCTGTATAAG 
34561 ATTCAAAAGC GGAACATTAA CAAAAATACC GCGATCCCGT AGGTCCCTTC GCAGGGCCAG 
34621 CTGAACATAA TCGTGCAGGT CTGCACGGAC CAGCGCGGCC ACTTCCCCGC CAGGAACCAT 
34681 GACAAAAGAA CCCACACTGA TTATGACACG CATACTCGGA GCTATGCTAA CCAGCGTAGC 
34741 CCCGATGTAA GCTTGTTGCA TGGGCGGCGA TATAAAATGC AAGGTACTGC TCAAAAAATC 
34801 AGGCAAAGCC TCGCGCAAAA AAGCAAGCAC ATCGTAGTCA TGCTCATGCA GATAAAGGCA 
34861 GGTAAGTTCC GGAACCACCA CAGAAAAAGA CACCATTTTT CTCTCAAACA TGTCTGCGGG
34921 TTCCTGCATA AACACAAAAT AAAATAACAA AAAAAAAAAA ACATTTAAAC ATTAGAAGCC 
34981 TGTNTTACAA CAGGAAAAAC AACCCTTATA AGCATAAGAC GGACTACGGC CATGCCGGCG 
35041 TGACCGTAAA AAAACTGGTC ACCGTGATTA AAAAGCACCA CCGACAGTTC CTCGGTCATG 
35101 TCCGGAGTCA TAATGTAAGA CTCGGTAAAC ACATCAGGTT GGTTAACATC GGTCAGTGCT 
35161 AAAAAGCGAC CGAAATAGCC CGGGGGAATA CATACCCGCA GGCGTAGAGA CAACATTACA
35221 GCCCCCATAG GAGGTATAAC AAAATTAATA GGAGAGAAAA ACACATAAAC ACCTGAAAAA
35281 CCCTCCTGCC TAGGCAAAAT AGCACCCTCC CGCTCCAGAA CAACATACAG CGCTTCCACA
35341 GCGGCAGCCA TAACAGTCAG CCTTACCAGT AAAAAAACCT ATTAAAAAAC ACCACTCGAC 
35401 ACGGCACCAG CTCAATCAGT CACAGTGTAA AAAGGGCCAA GTACAGAGCG AGTATATATA 
35461 GGACTAAAAA ATGACGTAAC GGTTAAAGTC CACAAAAACC ACCCAGAAAA CCGCACGCGA 
35521 ACCTACGCCC AGAAACGAAA GCCAAAAAAC CCACAACTTC CTCAAATCTT CACTTCCGTT 
35581 TTCCCACGAT ACGTCACTTC CCATTTTAAA AAAAAACTAC AATTCCCAAT ACATGCAAGT 
35641 TACTCCGCCC TAAAACCTAC GTCACCCGCC CCGTTCCCAC GCCCCGCGCC ACGTCACAAA 
35701 CTCCACCCCC TCATTATCAT ATTGGCTTCA ATCCAAAATA AGGTATATTA TTGATGATG 
1 CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT
121 GATGTTGCAA GTGTGGCGGA ACACATGTAA GCGACGGATG TGGCAAAAGT GACGTTTTTG 
181 GTGTGCGCCG GTGTACACAG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG
241 TAAATTTGGG CGTAACCGAG TAAGATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA
301 AGTGAAATCT GAATAATTTT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG 
361 GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC 
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG TGTAGTGTAT TTATACCCGG 
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 
601 AATGGCCGCC AGTCTTTTGG ACCAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GACTCTGTAA TGTTGGCGGT 
781 GCAGGAAGGG ATTGACTTAC TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA 
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA 
901 CCTTGTACCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG 
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAATTATGGG CAGTGGGTGA 
1141 TAGAGTGGTG GGTTTGGTGT GGTAATTTTT TTTTTAATTT TTACAGTTTT GTGGTTTAAA 
1201 GAATTTTGTA TTGTGATTTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGC CGTCCTAAAA TGGCGCCTGC TATCCTGAGA 
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCGGCAT TAAACCAGTT 
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG 
1501 CCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG
1801 TTTCTGTGGG GCTCATCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT TGTGAGACAC 
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCGA TAATACCGAC GGAGGAGCAG 
2161 CAGCAGCAGC AGGAGGAAGC CAGGCGGCGG CGGCAGGAGC AGAGCCCATG GAACCCGAGA 
2221 GCCGGCCTGG ACCCTCGGGA ATGAATGTTG TACAGGTGGC TGAACTGTAT CCAGAACTGA 
2281 GACGCATTTT GACAATTACA GAGGATGGGC AGGGGCTAAA GGGGGTAAAG AGGGAGCGGG 
2341 GGGCTTGTGA GGCTACAGAG GAGGCTAGGA ATCTAGCTTT TAGCTTAATG ACCAGACACC 
2401 GTCCTGAGTG TATTACTTTT CAACAGATCA AGGATAATTG CGCTAATGAG CTTGATCTGC 
2461 TGGCGCAGAA GTATTCCATA GAGCAGCTGA CCACTTACTG GCTGCAGCCA GGGGATGATT 
2521 TTGAGGAGGC TATTAGGGTA TATGCAAAGG TGGCACTTAG GCCAGATTGC AAGTACAAGA 
2581 TCAGCAAACT TGTAAATATC AGGAATTGTT GCTACATTTC TGGGAACGGG GCCGAGGTGG 
2641 AGATAGATAC GGAGGATAGG GTGGCCTTTA GATGTAGCAT GATAAATATG TGGCCGGGGG 
2701 TGCTTGGCAT GGACGGGGTG GTTATTATGA ATGTAAGGTT TACTGGCCCC AATTTTAGCG 
2761 GTACGGTTTT CCTGGCCAAT ACCAACCTTA TCCTACACGG TGTAAGCTTC TATGGGTTTA 
2821 ACAATACCTG TGTGGAAGCC TGGACCGATG TAAGGGTTCG GGGCTGTGCC TTTTACTGCT 
2881 GCTGGAAGGG GGTGGTGTGT CGCCCCAAAA GCAGGGCTTC AATTAAGAAA TGCCTCTTTG 
2941 AAAGGTGTAC CTTGGGTATC CTGTCTGAGG GTAACTCCAG GGTGCGCCAC AATGTGGCCT 
3001 CCGACTGTGG TTGCTTCATG CTAGTGAAAA GCGTGGCTGT GATTAAGCAT AACATGGTAT 
3061 GTGGCAACTG CGAGGACAGG GCCTCTCAGA TGCTGACCTG CTCGGACGGC AACTGTCACC
3121 TGCTGAAGAC CATTCACGTA GCCAGCCACT CTCGCAAGGC CTGGCCAGTG TTTGAGCATA 
3181 ACATACTGAC CCGCTGTTCC TTGCATTTGG GTAACAGGAG GGGGGTGTTC CTACCTTACC 
3241 AATGCAATTT GAGTCACACT AAGATATTGC TTGAGCCCGA GAGCATGTCC AAGGTGAACC
3301 TGAACGGGGT GTTTGACATG ACCATGAAGA TCTGGAAGGT GCTGAGGTAC GATGAGACCC
3361 GCACCAGGTG CAGACCCTGC GAGTGTGGCG GTAAACATAT TAGGAACCAG CCTGTGATGC 
3421 TGGATGTGAC CGAGGAGCTG AGGCCCGATC ACTTGGTGCT GGCCTGCACC CGCGCTGAGT 
3481 TTGGCTCTAG CGATGAAGAT ACAGATTGAG GTACTGAAAT GTGTGGGCGT GGCTTAAGGG 
3541 TGGGAAAGAA TATATAAGGT GGGGGTCTTA TGTAGTTTTG TATCTGTTTT GCAGCAGCCG 
3601 CCGCCGCCAT GAGCACCAAC TCGTTTGATG GAAGCATTGT GAGCTCATAT TTGACAACGC 
3661 GCATGCCCCC ATGGGCCGGG GTGCGTCAGA ATGTGATGGG CTCCAGCATT GATGGTCGCC 
3721 CCGTCCTGCC CGCAAACTCT ACTACCTTGA CCTACGAGAC CGTGTCTGGA ACGCCGTTGG 
3781 AGACTGCAGC CTCCGCCGCC GCTTCAGCCG CTGCAGCCAC CGCCCGCGGG ATTGTGACTG 
3841 ACTTTGCTTT CCTGAGCCCG CTTGCAAGCA GTGCAGCTTC CCGTTCATCC GCCCGCGATG 
3901 ACAAGTTGAC GGCTCTTTTG GCACAATTGG ATTCTTTGAC CCGGGAACTT AATGTCGTTT 
3961 CTCAGCAGCT GTTGGATCTG CGCCAGCAGG TTTCTGCCCT GAAGGCTTCC TCCCCTCCCA 
4021 ATGCGGTTTA AAACATAAAT AAAAAACCAG ACTCTGTTTG GATTTGGATC AAGCAAGTGT 
4081 CTTGCTGTCT TTATTTAGGG GTTTTGCGCG CGCGGTAGGC CCGGGACCAG CGGTCTCGGT 
4141 CGTTGAGGGT CCTGTGTATT TTTTCCAGGA CGTGGTAAAG GTGACTCTGG ATGTTCAGAT 
4201 ACATGGGCAT AAGCCCGTCT CTGGGGTGGA GGTAGCACCA CTGCAGAGCT TCATGCTGCG 
4261 GGGTGGTGTT GTAGATGATC CAGTCGTAGC AGGAGCGCTG GGCGTGGTGC CTAAAAATGT 
4321 CTTTCAGTAG CAAGCTGATT GCCAGGGGCA GGCCCTTGGT GTAAGTGTTT ACAAAGCGGT 
4381 TAAGCTGGGA TGGGTGCATA CGTGGGGATA TGAGATGCAT CTTGGACTGT ATTTTTAGGT 
4441 TGGCTATGTT CCCAGCCATA TCCCTCCGGG GATTCATGTT GTGCAGAACC ACCAGCACAG 
4501 TGTATCCGGT GCACTTGGGA AATTTGTCAT GTAGCTTAGA AGGAAATGCG TGGAAGAACT 
4561 TGGAGACGCC CTTGTGACCT CCAAGATTTT CCATGCATTC GTCCATAATG ATGGCAATGG 
4621 GCCCACGGGC GGCGGCCTGG GCGAAGATAT TTCTGGGATC ACTAACGTCA TAGTTGTGTT 
4681 CCAGGATGAG ATCGTCATAG GCCATTTTTA CAAAGCGCGG GCGGAGGGTG CCAGACTGCG 
4741 GTATAATGGT TCCATCCGGC CCAGGGGCGT AGTTACCCTC ACAGATTTGC ATTTCCCACG 
4801 CTTTGAGTTC AGATGGGGGG ATCATGTCTA CCTGCGGGGC GATGAAGAAA ACGGTTTCCG 
4861 GGGTAGGGGA GATCAGCTGG GAAGAAAGCA GGTTCCTGAG CAGCTGCGAC TTACCGCAGC 
4921 CGGTGGGCCC GTAAATCACA CCTATTACCG GGTGCAACTG GTAGTTAAGA GAGCTGCAGC 
4981 TGCCGTCATC CCTGAGCAGG GGGGCCACTT CGTTAAGCAT GTCCCTGACT CGCATGTTTT
5041 CCCTGACCAA ATCCGCCAGA AGGCGCTCGC CGCCCAGCGA TAGCAGTTCT TGCAAGGAAG 
5101 CAAAGTTTTT CAACGGTTTG AGACCGTCCG CCGTAGGCAT GCTTTTGAGC GTTTGACCAA 
5161 GCAGTTCCAG GCGGTCCCAC AGCTCGGTCA CCTGCTCTAC GGCATCTCGA TCCAGCATAT 
5221 CTCCTCGTTT CGCGGGTTGG GGCGGCTTTC GCTGTACGGC AGTAGTCGGT GCTCGTCCAG 
5281 ACGGGCCAGG GTCATGTCTT TCCACGGGCG CAGGGTCCTC GTCAGCGTAG TCTGGGTCAC 
5341 GGTGAAGGGG TGCGCTCCGG GCTGCGCGCT GGCCAGGGTG CGCTTGAGGC TGGTCCTGCT 
5401 GGTGCTGAAG CGCTGCCGGT CTTCGCCCTG CGCGTCGGCC AGGTAGCATT TGACCATGGT
5461 GTCATAGTCC AGCCCCTCCG CGGCGTGGCC CTTGGCGCGC AGCTTGCCCT TGGAGGAGGC 
5521 GCCGCACGAG GGGCAGTGCA GACTTTTGAG GGCGTAGAGC TTGGGCGCGA GAAATACCGA 
5581 TTCCGGGGAG TAGGCATCCG CGCCGCAGGC CCCGCAGACG GTCTCGCATT CCACGAGCCA 
5641 GGTGAGCTCT GGCCGTTCGG GGTCAAAAAC CAGGTTTCCC CCATGCTTTT TGATGCGTTT 
5701 CTTACCTCTG GTTTCCATGA GCCGGTGTCC ACGCTCGGTG ACGAAAAGGC TGTCCGTGTC 
5761 CCCGTATACA GACTTGAGAG GCCTGTCCTC GAGCGGTGTT CCGCGGTCCT CCTCGTATAG 
5821 AAACTCGGAC CACTCTGAGA CAAAGGCTCG CGTCCAGGCC AGCACGAAGG AGGCTAAGTG 
5881 GGAGGGGTAG CGGTCGTTGT CCACTAGGGG GTCCACTCGC TCCAGGGTGT GAAGACACAT 
5941 GTCGCCCTCT TCGGCATCAA GGAAGGTGAT TGGTTTGTAG GTGTAGGCCA CGTGACCGGG 
6001 TGTTCCTGAA GGGGGGCTAT AAAAGGGGGT GGGGGCGCGT TCGTCCTCAC TCTCTTCCGC 
6061 ATCGCTGTCT GCGAGGGCCA GCTGTTGGGG TGAGTACTCC CTCTGAAAAG CGGGCATGAC 
6121 TTCTGCGCTA AGATTGTCAG TTTCCAAAAA CGAGGAGGAT TTGATATTCA CCTGGCCCGC 
6181 GGTGATGCCT TTGAGGGTGG CCGCATCCAT CTGGTCAGAA AAGACAATCT TTTTGTTGTC 
6241 AAGCTTGGTG GCAAACGACC CGTAGAGGGC GTTGGACAGC AACTTGGCGA TGGAGCGCAG 
6301 GGTTTGGTTT TTGTCGCGAT CGGCGCGCTC CTTGGCCGCG ATGTTTAGCT GCACGTATTC 
6361 GCGCGCAACG CACCGCCATT CGGGAAAGAC GGTGGTGCGC TCGTCGGGCA CCAGGTGCAC 
6421 GCGCCAACCG CGGTTGTGCA GGGTGACAAG GTCAACGCTG GTGGCTACCT CTCCGCGTAG 
6481 GCGCTCGTTG GTCCAGCAGA GGCGGCCGCC CTTGCGCGAG CAGAATGGCG GTAGGGGGTC 
6541 TAGCTGCGTC TCGTCCGGGG GGTCTGCGTC CACGGTAAAG ACCCCGGGCA GCAGGCGCGC 
6601 GTCGAAGTAG TCTATCTTGC ATCCTTGCAA GTCTAGCGCC TGCTGCCATG CGCGGGCGGC
6661 AAGCGCGCGC TCGTATGGGT TGAGTGGGGG ACCCCATGGC ATGGGGTGGG TGAGCGCGGA 
6721 GGCGTACATG CCGCAAATGT CGTAAACGTA GAGGGGCTCT CTGAGTATTC CAAGATATGT 
6781 AGGGTAGCAT CTTCCACCGC GGATGCTGGC GCGCACGTAA TCGTATAGTT CGTGCGAGGG 
6841 AGCGAGGAGG TCGGGACCGA GGTTGCTACG GGCGGGCTGC TCTGCTCGGA AGACTATCTG 
6901 CCTGAAGATG GCATGTGAGT TGGATGATAT GGTTGGACGC TGGAAGACGT TGAAGCTGGC 
6961 GTCTGTGAGA CCTACCGCGT CACGCACGAA GGAGGCGTAG GAGTCGCGCA GCTTGTTGAC
7021 CAGCTCGGCG GTGACCTGCA CGTCTAGGGC GCAGTAGTCC AGGGTTTCCT TGATGATGTC 
7081 ATACTTATCC TGTCCCTTTT TTTTCCACAG CTCGCGGTTG AGGACAAACT CTTCGCGGTC 
7141 TTTCCAGTAC TCTTGGATCG GAAACCCGTC GGCCTCCGAA CGGTAAGAGC CTAGCATGTA 
7201 GAACTGGTTG ACGGCCTGGT AGGCGCAGCA TCCCTTTTCT ACGGGTAGCG CGTATGCCTG 
7261 CGCGGCCTTC CGGAGCGAGG TGTGGGTGAG CGCAAAGGTG TCCCTGACCA TGACTTTGAG
7321 GTACTGGTAT TTGAAGTCAG TGTCGTCGCA TCCGCCCTGC TCCCAGAGCA AAAAGTCCGT 
7381 GCGCTTTTTG GAACGCGGAT TTGGCAGGGC GAAGGTGACA TCGTTGAAGA GTATCTTTCC 
7441 CGCGCGAGGC ATAAAGTTGC GTGTGATGCG GAAGGGTCCC GGCACCTCGG AACGGTTGTT 
7501 AATTACCTGG GCGGCGAGCA CGATCTCGTC AAAGCCGTTG ATGTTGTGGC CGACAATGTA 
7561 AAGTTCCAAG AAGCGCGGGA TGCCCTTGAT GGAAGGCAAT TTTTTAAGTT CCTCGTAGGT 
7621 GAGCTCTTCA GGGGAGCTGA GCCCGTGCTC TGAAAGGGCC CAGTCTGCAA GATGAGGGTT 
7681 GGAAGCGACG AATGAGCTCC ACAGGTCACG GGCCATTAGC ATTTGCAGGT GGTCGCGAAA 
7741 GGTCCTAAAC TGGCGACCTA TGGCCATTTT TTCTGGGGTG ATGCAGTAGA AGGTAAGCGG 
7801 GTCTTGTTCC CAGCGGTCCC ATCCAAGGTT CGCGGCTAGG TCTCGCGCGG CAGTCACTAG 
7861 AGGCTCATCT CCGCCGAACT TCATGACCAG CATGAAGGGC ACGAGCTGCT TCCCAAAGGC 
7921 CCCCATCCAA GTATAGGTCT CTACATCGTA GGTGACAAAG AGACGCTCGG TGCGAGGATG 
7981 CGAGCCGATC GGGAAGAACT GGATCTCCCG CCACCAATTG GAGGAGTGGC TATTGATGTG 
8041 GTGAAAGTAG AAGTCCCTGC GACGGGCCGA ACACTCGTGC TGGCTTTTGT AAAAACGTGC 
8101 GCAGTACTGG CAGCGGTGCA CGGGCTGTAC ATCCTGCACG AGGTTGACCT GACGACCGCG 
8161 CACAAGGAAG CAGAGTGGGA ATTTGAGCCC CTCGCCTGGC GGGTTTGGCT GGTGGTCTTC 
8221 TACTTCGGCT GCTTGTCCTT GACCGTCTGG CTGCTCGAGG GGAGTTACGG TGGATCGGAC 
8281 CACCACGCCG CGCGAGCCCA AAGTCCAGAT GTCCGCGCGC GGCGGTCGGA GCTTGATGAC 
8341 AACATCGCGC AGATGGGAGC TGTCCATGGT CTGGAGCTCC CGCGGCGTCA GGTCAGGCGG 
8401 GAGCTCCTGC AGGTTTACCT CGCATAGACG GGTCAGGGCG CGGGCTAGAT CCAGGTGATA 
8461 CCTAATTTCC AGGGGCTGGT TGGTGGCGGC GTCGATGGCT TGCAAGAGGC CGCATCCCCG 
8521 CGGCGCGACT ACGGTACCGC GCGGCGGGCG GTGGGCCGCG GGGGTGTCCT TGGATGATGC 
8581 ATCTAAAAGC GGTGACGCGG GCGAGCCCCC GGAGGTAGGG GGGGCTCCGG ACCCGCCGGG 
8641 AGAGGGGGCA GGGGCACGTC GGCGCCGCGC GCGGGCAGGA GCTGGTGCTG CGCGCGTAGG 
8701 TTGCTGGCGA ACGCGACGAC GCGGCGGTTG ATCTCCTGAA TCTGGCGCCT CTGCGTGAAG
8761 ACGACGGGCC CGGTGAGCTT GAGCCTGAAA GAGAGTTCGA CAGAATCAAT TTCGGTGTCG 
8821 TTGACGGCGG CCTGGCGCAA AATCTCCTGC ACGTCTCCTG AGTTGTCTTG ATAGGCGATC 
8881 TCGGCCATGA ACTGCTCGAT CTCTTCCTCC TGGAGATCTC CGCGTCCGGC TCGCTCCACG 
8941 GTGGCGGCGA GGTCGTTGGA AATGCGGGCC ATGAGCTGCG AGAAGGCGTT GAGGCCTCCG 
9001 TCGTTCCAGA CGCGGCTGTA GACCACGCCC CCTTCGGCAT CGCGGGCGCG CATGACCACC 
9061 TGCGCGAGAT TGAGCTCCAC GTGCCGGGCG AAGACGGCGT AGTTTCGCAG GCGCTGAAAG 
9121 AGGTAGTTGA GGGTGGTGGC GGTGTGTTCT GCCACGAAGA AGTACATAAC CCAGCGTCGC 
9181 AACGTGGATT CGTTGATATC CCCCAAGGCC TCAAGGCGCT CCATGGCCTC GTAGAAGTCC 
9241 ACGGCGAAGT TGAAAAACTG GGAGTTGCGC GCCGACACGG TTAACTCCTC CTCCAGAAGA
9301 CGGATGAGCT CGGCGACAGT GTCGCGCACC TCGCGCTCAA AGGCTACAGG GGCCTCTTCT 
9361 TCTTCTTCAA TCTCCTCTTC CATAAGGGCC TCCCCTTCTT CTTCTTCTGG CGGCGGTGGG 
9421 GGAGGGGGGA CACGGCGGCG ACGACGGCGC ACCGGGAGGC GGTCGACAAA GCGCTCGATC 
9481 ATCTCCCCGC GGCGACGGCG CATGGTCTCG GTGACGGCGC GGCCGTTCTC GCGGGGGCGC 
9541 AGTTGGAAGA CGCCGCCCGT CATGTCCCGG TTATGGGTTG GCGGGGGGCT GCCATGCGGC 
9601 AGGGATACGG CGCTAACGAT GCATCTCAAC AATTGTTGTG TAGGTACTCC GCCGCCGAGG 
9661 GACCTGAGCG AGTCCGCATC GACCGGATCG GAAAACCTCT CGAGAAAGGC GTCTAACCAG 
9721 TCACAGTCGC AAGGTAGGCT GAGCACCGTG GCGGGCGGCA GCGGGCGGCG GTCGGGGTTG 
9781 TTTCTGGCGG AGGTGCTGCT GATGATGTAA TTAAAGTAGG CGGTCTTGAG ACGGCGGATG 
9841 GTCGACAGAA GCACCATGTC CTTGGGTCCG GCCTGCTGAA TGCGCAGGCG GTCGGCCATG 
9901 CCCCAGGCTT CGTTTTGACA TCGGCGCAGG TCTTTGTAGT AGTCTTGCAT GAGCCTTTCT
9961 ACCGGCACTT CTTCTTCTCC TTCCTCTTGT CCTGCATCTC TTGCATCTAT CGCTGCGGCG 
10021 GCGGCGGAGT TTGGCCGTAG GTGGCGCCCT CTTCCTCCCA TGCGTGTGAC CCCGAAGCCC 
10081 CTCATCGGCT GAAGCAGGGC TAGGTCGGCG ACAACGCGCT CGGCTAATAT GGCCTGCTGC 
10141 ACCTGCGTGA GGGTAGACTG GAAGTCATCC ATGTCCACAA AGCGGTGGTA TGCGCCCGTG 
10201 TTGATGGTGT AAGTGCAGTT GGCCATAACG GACCAGTTAA CGGTCTGGTG ACCCGGCTGC 
10261 GAGAGCTCGG TGTACCTGAG ACGCGAGTAA GCCCTCGAGT CAAATACGTA GTCGTTGCAA 
10321 GTCCGCACCA GGTACTGGTA TCCCACCAAA AAGTGCGGCG GCGGCTGGCG GTAGAGGGGC 
10381 CAGCGTAGGG TGGCCGGGGC TCCGGGGGCG AGATCTTCCA ACATAAGGCG ATGATATCCG 
10441 TAGATGTACC TGGACATCCA GGTGATGCCG GCGGCGGTGG TGGAGGCGCG CGGAAAGTCG 
10501 CGGACGCGGT TCCAGATGTT GCGCAGCGGC AAAAAGTGCT CCATGGTCGG GACGCTCTGG 
10561 CCGGTCAGGC GCGCGCAATC GTTGACGCTC TAGACCGTGC AAAAGGAGAG CCTGTAAGCG 
10621 GGCACTCTTC CGTGGTCTGG TGGATAAATT CGCAAGGGTA TCATGGCGGA CGACCGGGGT 
10681 TCGAGCCCCG TATCCGGCCG TCCGCCGTGA TCCATGCGGT TACCGCCCGC GTGTCGAACC 
10741 CAGGTGTGCG ACGTCAGACA ACGGGGGAGT GCTCCTTTTG GCTTCCTTCC AGGCGCGGCG
10801 GCTGCTGCGC TAGCTTTTTT GGCCACTGGC CGCGCGCAGC GTAAGCGGTT AGGCTGGAAA 
10861 GCGAAAGCAT TAAGTGGCTC GCTCCCTGTA GCCGGAGGGT TATTTTCCAA GGGTTGAGTC 
10921 GCGGGACCCC CGGTTCGAGT CTCGGACCGG CCGGACTGCG GCGAACGGGG GTTTGCCTCC 
10981 CCGTCATGCA AGACCCCGCT TGCAAATTCC TCCGGAAACA GGGACGAGCC CCTTTTTTGC 
11041 TTTTCCCAGA TGCATCCGGT GCTGCGGCAG ATGCGCCCCC CTCCTCAGCA GCGGCAAGAG 
11101 CAAGAGCAGC GGCAGACATG CAGGGCACCC TCCCCTCCTC CTACCGCGTC AGGAGGGGCG 
11161 ACATCCGCGG TTGACGCGGC AGCAGATGGT GATTACGAAC CCCCGCGGCG CCGGGCCCGG 
11221 CACTACCTGG ACTTGGAGGA GGGCGAGGGC CTGGCGCGGC TAGGAGCGCC CTCTCCTGAG 
11281 CGGTACCCAA GGGTGCAGCT GAAGCGTGAT ACGCGTGAGG CGTACGTGCC GCGGCAGAAC 
11341 CTGTTTCGCG ACCGCGAGGG AGAGGAGCCC GAGGAGATGC GGGATCGAAA GTTCCACGCA 
11401 GGGCGCGAGC TGCGGCATGG CCTGAATCGC GAGCGGTTGC TGCGCGAGGA GGACTTTGAG
11461 CCCGACGCGC GAACCGGGAT TAGTCCCGCG CGCGCACACG TGGCGGCCGC CGACCTGGTA 
11521 ACCGCATACG AGCAGACGGT GAACCAGGAG ATTAACTTTC AAAAAAGCTT TAACAACCAC 
11581 GTGCGTACGC TTGTGGCGCG CGAGGAGGTG GCTATAGGAC TGATGCATCT GTGGGACTTT 
11641 GTAAGCGCGC TGGAGCAAAA CCCAAATAGC AAGCCGCTCA TGGCGCAGCT GTTCCTTATA 
11701 GTGCAGCACA GCAGGGACAA CGAGGCATTC AGGGATGCGC TGCTAAACAT AGTAGAGCCC 
11761 GAGGGCCGCT GGCTGCTCGA TTTGATAAAC ATCCTGCAGA GCATAGTGGT GCAGGAGCGC 
11821 AGCTTGAGCC TGGCTGACAA GGTGGCCGCC ATCAACTATT CCATGCTTAG CCTGGGCAAG 
11881 TTTTACGCCC GCAAGATATA CCATACCCCT TACGTTCCCA TAGACAAGGA GGTAAAGATC 
11941 GAGGGGTTCT ACATGCGCAT GGCGCTGAAG GTGCTTACCT TGAGCGACGA CCTGGGCGTT 
12001 TATCGCAACG AGCGCATCCA CAAGGCCGTG AGCGTGAGCC GGCGGCGCGA GCTCAGCGAC 
12061 CGCGAGCTGA TGCACAGCCT GCAAAGGGCC CTGGCTGGCA CGGGCAGCGG CGATAGAGAG 
12121 GCCGAGTCCT ACTTTGACGC GGGCGCTGAC CTGCGCTGGG CCCCAAGCCG ACGCGCCCTG 
12181 GAGGCAGCTG GGGCCGGACC TGGGCTGGCG GTGGCACCCG CGCGCGCTGG CAACCTCGGC 
12241 GGCGTGGAGG AATATGACGA GGACGATGAG TACGAGCCAG AGGACGGCGA GTACTAAGCG 
12301 GTGATGTTTC TGATCAGATG ATGCAAGACG CAACGGACCC GGCGGTGCGG GCGGCGCTGC
12361 AGAGCCAGCC GTCCGGCCTT AACTCCACGG ACGACTGGCG CCAGGTCATG GACCGCATCA
12421 TGTCGCTGAC TGCGCGCAAT CCTGACGCGT TCCGGCAGCA GCCGCAGGCC AACCGGCTCT 
12481 CCGCAATTCT GGAAGCGGTG GTCCCGGCGC GCGCAAACCC CACGCACGAG AAGGTGCTGG 
12541 CGATCGTAAA CGCGCTGGCC GAAAACAGGG CCATCCGGCC CGACGAGGCC GGCCTGGTCT 
12601 ACGACGCGCT GCTTCAGCGC GTGGCTCGTT ACAACAGCGG CAACGTGCAG ACCAACCTGG 
12661 ACCGGCTGGT GGGGGATGTG CGCGAGGCCG TGGCGCAGCG TGAGCGCGCG CAGCAGCAGG 
12721 GCAACCTGGG CTCCATGGTT GCACTAAACG CCTTCCTGAG TACACAGCCC GCCAACGTGC 
12781 CGCGGGGACA GGAGGACTAC ACCAACTTTG TGAGCGCACT GCGGCTAATG GTGACTGAGA 
12841 CACCGCAAAG TGAGGTGTAC CAGTCTGGGC CAGACTATTT TTTCCAGACC AGTAGACAAG 
12901 GCCTGCAGAC CGTAAACCTG AGCCAGGCTT TCAAAAACTT GCAGGGGCTG TGGGGGGTGC 
12961 GGGCTCCCAC AGGCGACCGC GCGACCGTGT CTAGCTTGCT GACGCCCAAC TCGCGCCTGT 
13021 TGCTGCTGCT AATAGCGCCC TTCACGGACA GTGGCAGCGT GTCCCGGGAC ACATACCTAG 
13081 GTCACTTGCT GACACTGTAC CGCGAGGCCA TAGGTCAGGC GCATGTGGAC GAGCATACTT 
13141 TCCAGGAGAT TACAAGTGTC AGCCGCGCGC TGGGGCAGGA GGACACGGGC AGCCTGGAGG 
13201 CAACCCTAAA CTACCTGCTG ACCAACCGGC GGCAGAAGAT CCCCTCGTTG CACAGTTTAA
13261 ACAGCGAGGA GGAGCGCATT TTGCGCTACG TGCAGCAGAG CGTGAGCCTT AACCTGATGC 
13321 GCGACGGGGT AACGCCCAGC GTGGCGCTGG ACATGACCGC GCGCAACATG GAACCGGGCA 
13381 TGTATGCCTC AAACCGGCCG TTTATCAACC GCCTAATGGA CTACTTGCAT CGCGCGGCCG 
13441 CCGTGAACCC CGAGTATTTC ACCAATGCCA TCTTGAACCC GCACTGGCTA CCGCCCCCTG 
13501 GTTTCTACAC CGGGGGATTC GAGGTGCCCG AGGGTAACGA TGGATTCCTC TGGGACGACA 
13561 TAGACGACAG CGTGTTTTCC CCGCAACCGC AGACCCTGCT AGAGTTGCAA CAGCGCGAGC 
13621 AGGCAGAGGC GGCGCTGCGA AAGGAAAGCT TCCGCAGGCC AAGCAGCTTG TCCGATCTAG 
13681 GCGCTGCGGC CCCGCGGTCA GATGCTAGTA GCCCATTTCC AAGCTTGATA GGGTCTCTTA
13741 CCAGCACTCG CACCACCCGC CCGCGCCTGC TGGGCGAGGA GGAGTACCTA AACAACTCGC 
13801 TGCTGCAGCC GCAGCGCGAA AAAAACCTGC CTCCGGCATT TCCCAACAAC GGGATAGAGA 
13861 GCCTAGTGGA CAAGATGAGT AGATGGAAGA CGTACGCGCA GGAGCACAGG GACGTGCCAG 
13921 GCCCGCGCCC GCCCACCCGT CGTCAAAGGC ACGACCGTCA GCGGGGTCTG GTGTGGGAGG
13981 ACGATGACTC GGCAGACGAC AGCAGCGTCC TGGATTTGGG AGGGAGTGGC AACCCGTTTG
14041 CGCACCTTCG CCCCAGGCTG GGGAGAATGT TTTAAAAAAA AAAAAGCATG ATGCAAAATA 
14101 AAAAACTCAC CAAGGCCATG GCACCGAGCG TTGGTTTTCT TGTATTCCCC TTAGTATGCG 
14161 GCGCGCGGCG ATGTATGAGG AAGGTCCTCC TCCCTCCTAC GAGAGTGTGG TGAGCGCGGC 
14221 GCCAGTGGCG GCGGCGCTGG GTTCTCCCTT CGATGCTCCC CTGGACCCGC CGTTTGTGCC 
14281 TCCGCGGTAC CTGCGGCCTA CCGGGGGGAG AAACAGCATC CGTTACTCTG AGTTGGCACC 
14341 CCTATTCGAC ACCACCCGTG TGTACCTGGT GGACAACAAG TCAACGGATG TGGCATCCCT 
14401 GAACTACCAG AACGACCACA GCAACTTTCT GACCACGGTC ATTCAAAACA ATGACTACAG 
14461 CCCGGGGGAG GCAAGCACAC AGACCATCAA TCTTGACGAC CGGTCGCACT GGGGCGGCGA 
14521 CCTGAAAACC ATCCTGCATA CCAACATGCC AAATGTGAAC GAGTTCATGT TTACCAATAA 
14581 GTTTAAGGCG CGGGTGATGG TGTCGCGCTT GCCTACTAAG GACAATCAGG TGGAGCTGAA 
14641 ATACGAGTGG GTGGAGTTCA CGCTGCCCGA GGGCAACTAC TCCGAGACCA TGACCATAGA 
14701 CCTTATGAAC AACGCGATCG TGGAGCACTA CTTGAAAGTG GGCAGACAGA ACGGGGTTCT 
14761 GGAAAGCGAC ATCGGGGTAA AGTTTGACAC CCGCAACTTC AGACTGGGGT TTGACCCCGT 
14821 CACTGGTCTT GTCATGCCTG GGGTATATAC AAACGAAGCC TTCCATCCAG ACATCATTTT 
14881 GCTGCCAGGA TGCGGGGTGG ACTTCACCCA CAGCCGCCTG AGCAACTTGT TGGGCATCCG 
14941 CAAGCGGCAA CCCTTCCAGG AGGGCTTTAG GATCACCTAC GATGATCTGG AGGGTGGTAA 
15001 CATTCCCGCA CTGTTGGATG TGGACGCCTA CCAGGCGAGC TTGAAAGATG ACACCGAACA
15061 GGGCGGGGGT GGCGCAGGCG GCAGCAACAG CAGTGGCAGC GGCGCGGAAG AGAACTCCAA 
15121 CGCGGCAGCC GCGGCAATGC AGCCGGTGGA GGACATGAAC GATCATGCCA TTCGCGGCGA 
15181 CACCTTTGCC ACACGGGCTG AGGAGAAGCG CGCTGAGGCC GAAGCAGCGG CCGAAGCTGC 
15241 CGCCCCCGCT GCGCAACCCG AGGTCGAGAA GCCTCAGAAG AAACCGGTGA TCAAACCCCT 
15301 GACAGAGGAC AGCAAGAAAC GCAGTTACAA CCTAATAAGC AATGACAGCA CCTTCACCCA 
15361 GTACCGCAGC TGGTACCTTG CATACAACTA CGGCGACCCT CAGACCGGAA TCCGCTCATG 
15421 GACCCTGCTT TGCACTCCTG ACGTAACCTG CGGCTCGGAG CAGGTCTACT GGTCGTTGCC 
15481 AGACATGATG CAAGACCCCG TGACCTTCCG CTCCACGCGC CAGATCAGCA ACTTTCCGGT 
15541 GGTGGGCGCC GAGCTGTTGC CCGTGCACTC CAAGAGCTTC TACAACGACC AGGCCGTCTA 
15601 CTCCCAACTC ATCCGCCAGT TTACCTCTCT GACCCACGTG TTCAATCGCT TTCCCGAGAA 
15661 CCAGATTTTG GCGCGCCCGC CAGCCCCCAC CATCACCACC GTCAGTGAAA ACGTTCCTGC 
15721 TCTCACAGAT CACGGGACGC TACCGCTGCG CAACAGCATC GGAGGAGTCC AGCGAGTGAC 
15781 CATTACTGAC GCCAGACGCC GCACCTGCCC CTACGTTTAC AAGGCCCTGG GCATAGTCTC 
15841 GCCGCGCGTC CTATCGAGCC GCACTTTTTG AGCAAGCATG TCCATCCTTA TATCGCCCAG 
15901 CAATAACACA GGCTGGGGCC TGCGCTTCCC AAGCAAGATG TTTGGCGGGG CCAAGAAGCG 
15961 CTCCGACCAA CACCCAGTGC GCGTGCGCGG GCACTACCGC GCGCCCTGGG GCGCGCACAA 
16021 ACGCGGCCGC ACTGGGCGCA CCACCGTCGA TGACGCCATC GACGCGGTGG TGGAGGAGGC 
16081 GGGCAACTAC ACGCCCACGC CGCCACCAGT GTCCACAGTG GACGCGGCCA TTCAGACCGT 
16141 GGTGCGCGGA GCCCGGCGCT ATGCTAAAAT GAAGAGACGG CGGAGGCGCG TAGCACGTCG
16201 CCACCGCCGC CGACCCGGCA CTGCCGCCCA ACGCGCGGCG GCGGCCCTGC TTAACCGCGC 
16261 ACGTCGCACC GGCCGACGGG CGGCCATGCG GGCCGCTCGA AGGCTGGCCG CGGGTATTGT 
16321 CACTGTGCCC CCCAGGTCCA GGCGACGAGC GGCCGCCGCA GCAGCCGCGG CCATTAGTGC 
16381 TATGACTCAG GGTCGCAGGG GCAACGTGTA TTGGGTGCGC GACTCGGTTA GCGGCCTGCG 
16441 CGTGCCCGTG CGCACCCGCC CCCCGCGCAA CTAGATTGCA AGAAAAAACT ACTTAGACTC 
16501 GTACTGTTGT ATGTATCCAG CGGCGGCGGC GCGCAACGAA GCTATGTCCA AGCGCAAAAT
16561 CAAAGAAGAG ATGCTCCAGG TCATCGCGCC GGAGATCTAT GGCCCCCCGA AGAAGGAAGA 
16621 GCAGGATTAC AAGCCCCGAA AGCTAAAGCG GGTCAAAAAG AAAAAGAAAG ATGATGATGA 
16681 TGAACTTGAC GACGAGGTGG AACTGCTGCA CGCTACCGCG CCCAGGCGAC GGGTACAGTG 
16741 GAAAGGTCGA CGCGTAAAAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 
16801 TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 
16861 GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 
16921 GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TAACACTGCA 
16981 GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 
17041 TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGC CAGCGACTGG AAGATGTCTT 
17101 GGAAAAAATG ACCGTGGAAC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 
17161 GGTGGCGCCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACTA CCAGTAGCAC 
17221 CAGTATTGCC ACCGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCAGCGGT 
17281 GGCGGATGCC GCGGTGCAGG CGGTCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA 
17341 AACGGACCCG TGGATGTTTC GCGTTTCAGC CCCCCGGCGC CCGCGCGGTT CGAGGAAGTA 
17401 CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATTG CGCCTACCCC 
17461 CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 
17521 CACTGGAACC CGCCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG 
17641 CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGATATG GCCCTCACCT GCCGCCTCCG 
17701 TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG 
17761 CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT 
17821 GCGCGGCGGT ATCCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC 
17881 CGGAATTGCA TCCGTGGCCT TGCAGGCGCA GAGACACTGA TTAAAAACAA GTTGCATGTG 
17941 GAAAAATCAA AATAAAAAGT CTGGACTCTC ACGCTCGCTT GGTCCTGTAA CTATTTTGTA 
18001 GAATGGAAGA CATCAACTTT GCGTCTCTGG CCCCGCGACA CGGCTCGCGC CCGTTCATGG 
18061 GAAACTGGCA AGATATCGGC ACCAGCAATA TGAGCGGTGG CGCCTTCAGC TGGGGCTCGC 
18121 TGTGGAGCGG CATTAAAAAT TTCGGTTCCA CCGTTAAGAA CTATGGCAGC AAGGCCTGGA 
18181 ACAGCAGCAC AGGCCAGATG CTGAGGGATA AGTTGAAAGA GCAAAATTTC CAACAAAAGG 
18241 TGGTAGATGG CCTGGCCTCT GGCATTAGCG GGGTGGTGGA CCTGGCCAAC CAGGCAGTGC 
18301 AAAATAAGAT TAACAGTAAG CTTGATCCCC GCCCTCCCGT AGAGGAGCCT CCACCGGCCG 
18361 TGGAGACAGT GTCTCCAGAG GGGCGTGGCG AAAAGCGTCC GCGCCCCGAC AGGGAAGAAA 
18421 CTCTGGTGAC GCAAATAGAC GAGCCTCCCT CGTACGAGGA GGCACTAAAG CAAGGCCTGC 
18481 CCACCACCCG TCCCATCGCG CCCATGGCTA CCGGAGTGCT GGGCCAGCAC ACACCCGTAA
18541 CGCTGGACCT GCCTCCCCCC GCCGACACCC AGCAGAAACC TGTGCTGCCA GGCCCGACCG 
18601 CCGTTGTTGT AACCCGTCCT AGCCGCGCGT CCCTGCGCCG CGCCGCCAGC GGTCCGCGAT 
18661 CGTTGCGGCC CGTAGCCAGT GGCAACTGGC AAAGCACACT GAACAGCATC GTGGGTCTGG 
18721 GGGTGCAATC CCTGAAGCGC CGACGATGCT TCTGAATAGC TAACGTGTCG TATGTGTGTC 
18781 ATGTATGCGT CCATGTCGCC GCCAGAGGAG CTGCTGAGCC GCCGCGCGCC CGCTTTCCAA 
18841 GATGGCTACC CCTTCGATGA TGCCGCAGTG GTCTTACATG CACATCTCGG GCCAGGACGC 
18901 CTCGGAGTAC CTGAGCCCCG GGCTGGTGCA GTTTGCCCGC GCCACCGAGA CGTACTTCAG 
18961 CCTGAATAAC AAGTTTAGAA ACCCCACGGT GGCGCCTACG CACGACGTGA CCACAGACCG 
19021 GTCCCAGCGT TTGACGCTGC GGTTCATCCC TGTGGACCGT GAGGATACTG CGTACTCGTA 
19081 CAAGGCGCGG TTCACCCTAG CTGTGGGTGA TAACCGTGTG CTGGACATGG CTTCCACGTA 
19141 CTTTGACATC CGCGGCGTGC TGGACAGGGG CCCTACTTTT AAGCCCTACT CTGGCACTGC 
19201 CTACAACGCC CTGGCTCCCA AGGGTGCCCC AAATCCTTGC GAATGGGATG AAGCTGCTAC 
19261 TGCTCTTGAA ATAAACCTAG AAGAAGAGGA CGATGACAAC GAAGACGAAG TAGACGAGCA 
19321 AGCTGAGCAG CAAAAAACTC ACGTATTTGG GCAGGCGCCT TATTCTGGTA TAAATATTAC 
19381 AAAGGAGGGT ATTCAAATAG GTGTCGAAGG TCAAACACCT AAATATGCCG ATAAAACATT 
19441 TCAACCTGAA CCTCAAATAG GAGAATCTCA GTGGTACGAA ACTGAAATTA ATCATGCAGC 
19501 TGGGAGAGTC CTTAAAAAGA CTACCCCAAT GAAACCATGT TACGGTTCAT ATGCAAAACC 
19561 CACAAATGAA AATGGAGGGC AAGGCATTCT TGTAAAGCAA CAAAATGGAA AGCTAGAAAG 
19621 TCAAGTGGAA ATGCAATTTT TCTCAACTAC TGAGGCGACC GCAGGCAATG GTGATAACTT 
19681 GACTCCTAAA GTGGTATTGT ACAGTGAAGA TGTAGATATA GAAACCCCAG ACACTCATAT 
19741 TTCTTACATG CCCACTATTA AGGAAGGTAA CTCACGAGAA CTAATGGGCC AACAATCTAT 
19801 GCCCAACAGG CCTAATTACA TTGCTTTTAG GGACAATTTT ATTGGTCTAA TGTATTACAA
19861 CAGCACGGGT AATATGGGTG TTCTGGCGGG CCAAGCATCG CAGTTGAATG CTGTTGTAGA
19921 TTTGCAAGAC AGAAACACAG AGCTTTCATA CCAGCTTTTG CTTGATTCCA TTGGTGATAG 
19981 AACCAGGTAC TTTTCTATGT GGAATCAGGC TGTTGACAGC TATGATCCAG ATGTTAGAAT 
20041 TATTGAAAAT CATGGAACTG AAGATGAACT TCCAAATTAC TGCTTTCCAC TGGGAGGTGT 
20101 GATTAATACA GAGACTCTTA CCAAGGTAAA ACCTAAAACA GGTCAGGAAA ATGGATGGGA 
20161 AAAAGATGCT ACAGAATTTT CAGATAAAAA TGAAATAAGA GTTGGAAATA ATTTTGCCAT 
20221 GGAAATCAAT CTAAATGCCA ACCTGTGGAG AAATTTCCTG TACTCCAACA TAGCGCTGTA 
20281 TTTGCCCGAC AAGCTAAAGT ACAGTCCTTC CAACGTAAAA ATTTCTGATA ACCCAAACAC
20341 CTACGACTAC ATGAACAAGC GAGTGGTGGC TCCCGGGTTA GTGGACTGCT ACATTAACCT
20401 TGGAGCACGC TGGTCCCTTG ACTATATGGA CAACGTCAAC CCATTTAACC ACCACCGCAA 
20461 TGCTGGCCTG CGCTACCGCT CAATGTTGCT GGGCAATGGT CGCTATGTGC CCTTCCACAT 
20521 CCAGGTGCCT CAGAAGTTCT TTGCCATTAA AAACCTCCTT CTCCTGCCGG GCTCATACAC 
20581 CTACGAGTGG AACTTCAGGA AGGATGTTAA CATGGTTCTG CAGAGCTCCC TAGGAAATGA
20641 CCTAAGGGTT GACGGAGCCA GCATTAAGTT TGATAGCATT TGCCTTTACG CCACCTTCTT
20701 CCCCATGGCC CACAACACCG CCTCCACGCT TGAGGCCATG CTTAGAAACG ACACCAACGA 
20761 CCAGTCCTTT AACGACTATC TCTCCGCCGC CAACATGCTC TACCCTATAC CCGCCAACGC 
20821 TACCAACGTG CCCATATCCA TCCCCTCCCG CAACTGGGCG GCTTTCCGCG GCTGGGCCTT 
20881 CACGCGCCTT AAGACTAAGG AAACCCCATC ACTGGGCTCG GGCTACGACC CTTATTACAC 
20941 CTACTCTGGC TCTATACCCT ACCTAGATGG AACCTTTTAC CTCAACCACA CCTTTAAGAA 
21001 GGTGGCCATT ACCTTTGACT CTTCTGTCAG CTGGCCTGGC AATGACCGCC TGCTTACCCC 
21061 CAACGAGTTT GAAATTAAGC GCTCAGTTGA CGGGGAGGGT TACAACGTTG CCCAGTGTAA 
21121 CATGACCAAA GACTGGTTCC TGGTACAAAT GCTAGCTAAC TACAACATTG GCTACCAGGG 
21181 CTTCTATATC CCAGAGAGCT ACAAGGACCG CATGTACTGC TTCTTTAGAA ACTTGCAGCC 
21241 CATGAGCCGT CAGGTGGTGG ATGATACTAA ATACAAGGAC TACCAACAGG TGGGCATCCT 
21301 ACACCAACAC AACAACTCTG GATTTGTTGG CTACCTTGCC CCCACCATGC GCGAAGGACA 
21361 GGCCTACCCT GCTAACTTCC CCTATCCGCT TATAGGCAAG ACCGCAGTTG ACAGCATTAC 
21421 CCAGAAAAAG TTTCTTTGCG ATCGCACCCT TTGGCGCATC CCATTCTCCA GTAACTTTAT 
21481 GTCCATGGGC GCACTCACAG ACCTGGGCCA AAACCTTCTC TACGCCAACT CCGCCCACGC 
21541 GCTAGACATG ACTTTTGAGG TGGATCCCAT GGACGAGCCC ACCCTTCTTT ATGTTTTGTT 
21601 TGAAGTCTTT GACGTGGTCC GTGTGCACCG GCCGCACCGC GGCGTCATCG AAACCGTGTA 
21661 CCTGCGCACG CCCTTCTCGG CCGGCAACGC CACAACATAA AGAAGCAAGG AACATCAACA 
21721 ACAGCTGCCG CCATGGGCTC CAGTGAGCAG GAACTGAAAG CCATTGTCAA AGATCTTGGT 
21781 TGTGGGCCAT ATTTTTTGGG CACCTATGAC AAGCGCTTTC CAGGCTTTGT TTCTCCACAC 
21841 AAGCTCGCCT GCGCGATAGT CAATACGGCC GGTCGCGAGA CTGGGGGCGT ACACTGGATG 
21901 GCCTTTGCGT GGAACCCGCA CTCAAAAACA TGCTACCTCT TTGAGCCCTT TGGCTTTTCT 
21961 GACCAGCGAC TCAAGCAGGT TTACCAGTTT GAGTACGAGT CACTCCTGCG CCGTAGCGCC 
22021 ATTGCTTCTT CCCCCGACCG CTGTATAACG CTGGAAAAGT CCACCCAAAG CGTACAGGGG 
22081 CCCAACTCGG CCGCCTGTGG AGTATTCTGC TGCATGTTTC TCCACGCCTT TGCCAACTGG 
22141 CCCCAAACTC CCATGGATCA CAACCCCACC ATGAACCTTA TTACCGGGGT ACCCAACTCC 
22201 ATGCTCAACA GTCCCCAGGT ACAGCCCACC CTGCGTCGCA ACCAGGAACA GCTCTACAGC 
22261 TTCCTGGAGC GCCACTCGCC CTACTTCCGC AGCCACAGTG CGCAGATTAG GAGCGCCACT
22321 TCTTTTTGTC ACTTGAAAAA CATGTAAAAA TAATGTACTA GAGACACTTT CAATAAAGGG
22381 AAATGCTTTT ATTTGTACAC TCTCGGGTGA TTATTTACCC CCACCCTTGC CGTCTGCGCC 
22441 GTTTAAAAAT CAAAGGGGTT CTGCCGCGCA TCGCTATGCG CCACTGGCAG GGACACGTTG 
22501 CGATACTGGT GTTTAGTGCT CCACTTAAAC TCAGGCACAA CCATCCGCGG CAGCTCGGTG
22561 AAGTTTTCAC TCCACAGGCT GCGCACCATC ACCAACGCGT TTAGCAGGTC GGGCGCCGAT 
22621 ATCTTGAAGT CGCAGTTGGG GCCTCCGCCC TGCGCGCGCG AGTTGCGATA CACAGGGTTG
22681 CAGCACTGGA ACACTATCAG CGCCGGGTGG TGCACGCTGG CCAGCACGCT CTTGTCGGAG 
22741 ATCAGATCCG CGTCCAGGTC CTCCGCGTTG CTCAGGGCGA ACGGAGTCAA CTTTGGTAGC 
22801 TGCCTTCCCA AAAAGGGCGC GTGCCCAGGC TTTGAGTTGC ACTCGCACCG TAGTGGCATC 
22861 AAAAGGTGAC CGTGCCCGGT CTGGGCGTTA GGATACAGCG CCTGCATAAA AGCCTTGATC 
22921 TGCTTAAAAG CCACCTGAGC CTTTGCGCCT TCAGAGAAGA ACATGCCGCA AGACTTGCCG 
22981 GAAAACTGAT TGGCCGGACA GGCCGCGTCG TGCACGCAGC ACCTTGCGTC GGTGTTGGAG 
23041 ATCTGCACCA CATTTCGGCC CCACCGGTTC TTCACGATCT TGGCCTTGCT AGACTGCTCC 
23101 TTCAGCGCGC GCTGCCCGTT TTCGCTCGTC ACATCCATTT CAATCACGTG CTCCTTATTT
23161 ATCATAATGC TTCCGTGTAG ACACTTAAGC TCGCCTTCGA TCTCAGCGCA GCGGTGCAGC 
23221 CACAACGCGC AGCCCGTGGG CTCGTGATGC TTGTAGGTCA CCTCTGCAAA CGACTGCAGG 
23281 TACGCCTGCA GGAATCGCCC CATCATCGTC ACAAAGGTCT TGTTGCTGGT GAAGGTCAGC 
23341 TGCAACCCGC GGTGCTCCTC GTTCAGCCAG GTCTTGCATA CGGCCGCCAG AGCTTCCACT 
23401 TGGTCAGGCA GTAGTTTGAA GTTCGCCTTT AGATCGTTAT CCACGTGGTA CTTGTCCATC 
23461 AGCGCGCGCG CAGCCTCCAT GCCCTTCTCC CACGCAGACA CGATCGGCAC ACTCAGCGGG 
23521 TTCATCACCG TAATTTCACT TTCCGCTTCG CTGGGCTCTT CCTCTTCCTC TTGCGTCCGC
23581 ATACCACGCG CCACTGGGTC GTCTTCATTC AGCCGCCGCA CTGTGCGCTT ACCTCCTTTG 
23641 CCATGCTTGA TTAGCACCGG TGGGTTGCTG AAACCCACCA TTTGTAGCGC CACATCTTCT 
23701 CTTTCTTCCT CGCTGTCCAC GATTACCTCT GGTGATGGCG GGCGGTCGGG CTTGGGAGAA 
23761 GGGCGCTTCT TTTTCTTCTT GGGCGCAATG GCCAAATCCG CCGCCGAGGT CGATGGCCGC 
23821 GGGCTGGGTG TGCGCGGCAC CAGCGCGTCT TGTGATGAGT CTTCCTCGTC CTCGGACTCG 
23881 ATACGCCGCC TCATCCGCTT TTTTGGGGGC GCCCGGGGAG GCGGCGGCGA CGGGGACGGG 
23941 GACGACACGT CCTCCATGGT TGGGGGACGT CGCGCCGCAC CGCGTCCGCG CTCGGGGGTG 
24001 GTTTCGCGCT GCTCCTCTTC CCGACTGGCC ATTTCCTTCT CCTATAGGCA GAAAAAGATC 
24061 ATGGAGTCAG TCGAGAAGAA GGACAGCCTA ACCGCCCCCT CTGAGTTCGC CACCACCGCC 
24121 TCCACCGATG CCGCCAACGC GCCTACCACC TTCCCCGTCG AGGCACCCCC GCTTGAGGAG 
24181 GAGGAAGTGA TTATCGAGCA GGACCCAGGT TTTGTAAGCG AAGACGACGA GGACCGCTCA 
24241 GTACCAACAG AGGATAAAAA GCAAGACCAG GACAACGCAG AGGCAAACGA GGAACAAGTC 
24301 GGGCGGGGGG ACGAAAGGCA TGGCGACTAC CTAGATGTGG GAGACGACGT GCTGTTGAAG 
24361 CATCTGCAGC GCCAGTGCGC CATTATCTGC GACGCGTTGC AAGAGCGCAG CGATGTGCCC 
24421 CTCGCCATAG CGGATGTCAG CCTTGCCTAC GAACGCCACC TATTCTCACC GCGCGTACCC 
24481 CCCAAACGCC AAGAAAACGG CACATGCGAG CCCAACCCGC GCCTCAACTT CTACCCCGTA 
24541 TTTGCCGTGC CAGAGGTGCT TGCCACCTAT CACATCTTTT TCCAAAACTG CAAGATACCC 
24601 CTATCCTGCC GTGCCAACCG CAGCCGAGCG GACAAGCAGC TGGCCTTGCG GCAGGGCGCT 
24661 GTCATACCTG ATATCGCCTC GCTCAACGAA GTGCCAAAAA TCTTTGAGGG TCTTGGACGC 
24721 GACGAGAAGC GCGCGGCAAA CGCTCTGCAA CAGGAAAACA GCGAAAATGA AAGTCACTCT 
24781 GGAGTGTTGG TGGAACTCGA GGGTGACAAC GCGCGCCTAG CCGTACTAAA ACGCAGCATC
24841 GAGGTCACCC ACTTTGCCTA CCCGGCACTT AACCTACCCC CCAAGGTCAT GAGCACAGTC 
24901 ATGAGTGAGC TGATCGTGCG CCGTGCGCAG CCCCTGGAGA GGGATGCAAA TTTGCAAGAA 
24961 CAAACAGAGG AGGGCCTACC CGCAGTTGGC GACGAGCAGC TAGCGCGCTG GCTTCAAACG 
25021 CGCGAGCCTG CCGACTTGGA GGAGCGACGC AAACTAATGA TGGCCGCAGT GCTCGTTACC
25081 GTGGAGCTTG AGTGCATGCA GCGGTTCTTT GCTGACCCGG AGATGCAGCG CAAGCTAGAG 
25141 GAAACATTGC ACTACACCTT TCGACAGGGC TACGTACGCC AGGCCTGCAA GATCTCCAAC 
25201 GTGGAGCTCT GCAACCTGGT CTCCTACCTT GGAATTTTGC ACGAAAACCG CCTTGGGCAA 
25261 AACGTGCTTC ATTCCACGCT CAAGGGCGAG GCGCGCCGCG ACTACGTCCG CGACTGCGTT 
25321 TACTTATTTC TATGCTACAC CTGGCAGACG GCCATGGGCG TTTGGCAGCA GTGCTTGGAG
25381 GAGTGCAACC TCAAGGAGCT GCAGAAACTG CTAAAGCAAA ACTTGAAGGA CCTATGGACG 
25441 GCCTTCAACG AGCGCTCCGT GGCCGCGCAC CTGGCGGACA TCATTTTCCC CGAACGCCTG 
25501 CTTAAAACCC TGCAACAGGG TCTGCCAGAC TTCACCAGTC AAAGCATGTT GCAGAACTTT 
25561 AGGAACTTTA TCCTAGAGCG CTCAGGAATC TTGCCCGCCA CCTGCTGTGC ACTTCCTAGC 
25621 GACTTTGTGC CCATTAAGTA CCGCGAATGC CCTCCGCCGC TTTGGGGCCA CTGCTACCTT 
25681 CTGCAGCTAG CCAACTACCT TGCCTACCAC TCTGACATAA TGGAAGACGT GAGCGGTGAC 
25741 GGTCTACTGG AGTGTCACTG TCGCTGCAAC CTATGCACCC CGCACCGCTC CCTGGTTTGC 
25801 AATTCGCAGC TGCTTAACGA AAGTCAAATT ATCGGTACCT TTGAGCTGCA GGGTCCCTCG 
25861 CCTGACGAAA AGTCCGCGGC TCCGGGGTTG AAACTCACTC CGGGGCTGTG GACGTCGGCT 
25921 TACCTTCGCA AATTTGTACC TGAGGACTAC CACGCCCACG AGATTAGGTT CTACGAAGAC 
25981 CAATCCCGCC CGCCAAATGC GGAGCTTACC GCCTGCGTCA TTACCCAGGG CCACATTCTT 
26041 GGCCAATTGC AAGCCATCAA CAAAGCCCGC CAAGAGTTTC TGCTACGAAA GGGACGGGGG 
26101 GTTTACTTGG ACCCCCAGTC CGGCGAGGAG CTCAACCCAA TCCCCCCGCC GCCGCAGCCC 
26161 TATCAGCAGC AGCCGCGGGC CCTTGCTTCC CAGGATGGCA CCCAAAAAGA AGCTGCAGCT 
26221 GCCGCCGCCA CCCACGGACG AGGAGGAATA CTGGGACAGT CAGGCAGAGG AGGTTTTGGA 
26281 CGAGGAGGAG GAGGACATGA TGGAAGACTG GGAGAGCCTA GACGAGGAAG CTTCCGAGGT 
26341 CGAAGAGGTG TCAGACGAAA CACCGTCACC CTCGGTCGCA TTCCCCTCGC CGGCGCCCCA 
26401 GAAATCGGCA ACCGGTTCCA GCATGGCTAC AACCTCCGCT CCTCAGGCGC CGCCGGCACT
26461 GCCCGTTCGC CGACCCAACC GTAGATGGGA CACCACTGGA ACCAGGGCCG GTAAGTCCAA
26521 GCAGCCGCCG CCGTTAGCCC AAGAGCAACA ACAGCGCCAA GGCTACCGCT CATGGCGCGG 
26581 GCACAAGAAC GCCATAGTTG CTTGCTTGCA AGACTGTGGG GGCAACATCT CCTTCGCCCG 
26641 CCGCTTTCTT CTCTACCATC ACGGCGTGGC CTTCCCCCGT AACATCCTGC ATTACTACCG 
26701 TCATCTCTAC AGCCCATACT GCACCGGCGG CAGCGGCAGC GGCAGCAACA GCAGCGGCCA 
26761 CACAGAAGCA AAGGCGACCG GATAGCAAGA CTCTGACAAA GCCCAAGAAA TCCACAGCGG 
26821 CGGCAGCAGC AGGAGGAGGA GCGCTGCGTC TGGCGCCCAA CGAACCCGTA TCGACCCGCG 
26881 AGCTTAGAAA CAGGATTTTT CCCACTCTGT ATGCTATATT TCAACAGAGC AGGGGCCAAG 
26941 AACAAGAGCT GAAAATAAAA AACAGGTCTC TGCGATCCCT CACCCGCAGC TGCCTGTATC 
27001 ACAAAAGCGA AGATCAGCTT CGGCGCACGC TGGAAGACGC GGAGGCTCTC TTCAGTAAAT 
27061 ACTGCGCGCT GACTCTTAAG GACTAGTTTC GCGCCCTTTC TCAAATTTAA GCGCGAAAAC 
27121 TACGTCATCT CCAGCGGCCA CACCCGGCGC CAGCACCTGT CGTCAGCGCC ATTATGAGCA 
27181 AGGAAATTCC CACGCCCTAC ATGTGGAGTT ACCAGCCACA AATGGGACTT GCGGCTGGAG 
27241 CTGCCCAAGA CTACTCAACC CGAATAAACT ACATGAGCGC GGGACCCCAC ATGATATCCC 
27301 GGGTCAACGG AATCCGCGCC CACCGAAACC GAATTCTCTT GGAACAGGCG GCTATTACCA 
27361 CCACACCTCG TAATAACCTT AATCCCCGTA GTTGGCCCGC TGCCCTGGTG TACCAGGAAA 
27421 GTCCCGCTCC CACCACTGTG GTACTTCCCA GAGACGCCCA GGCCGAAGTT CAGATGACTA 
27481 ACTCAGGGGC GCAGCTTGCG GGCGGCTTTC GTCACAGGGT GCGGTCGCCC GGGCAGGGTA 
27541 TAACTCACCT GACAATCAGA GGGCGAGGTA TTCAGCTCAA CGACGAGTCG GTGAGCTCCT 
27601 CGCTTGGTCT CCGTCCGGAC GGGACATTTC AGATCGGCGG CGCCGGCCGT CCTTCATTCA 
27661 CGCCTCGTCA GGCAATCCTA ACTCTGCAGA CCTCGTCCTC TGAGCCGCGC TCTGGAGGCA 
27721 TTGGAACTCT GCAATTTATT GAGGAGTTTG TGCCATCGGT CTACTTTAAC CCCTTCTCGG 
27781 GACCTCCCGG CCACTATCCG GATCAATTTA TTCCTAACTT TGACGCGGTA AAGGACTCGG 
27841 CGGACGGCTA CGACTGAATG TTAAGTGGAG AGGCAGAGCA ACTGCGCCTG AAACACCTGG 
27901 TCCACTGTCG CCGCCACAAG TGCTTTGCCC GCGACTCCGG TGAGTTTTGC TACTTTGAAT 
27961 TGCCCGAGGA TCATATCGAG GGCCCGGCGC ACGGCGTCCG GCTTACCGCC CAGGGAGAGC 
28021 TTGCCCGTAG CCTGATTCGG GAGTTTACCC AGCGCCCCCT GCTAGTTGAG CGGGACAGGG 
28081 GACCCTGTGT TCTCACTGTG ATTTGCAACT GTCCTAACCT TGGATTACAT CAAGATCTTT 
28141 GTTGCCATCT CTGTGCTGAG TATAATAAAT ACAGAAATTA AAATATACTG GGGCTCCTAT 
28201 CGCCATCCTG TAAACGCCAC CGTCTTCACC CGCCCAAGCA AACCAAGGCG AACCTTACCT 
28261 GGTACTTTTA AGATCTCTCC CTCTGTGATT TACAACAGTT TCAACCCAGA CGGAGTGAGT 
28321 CTACGAGAGA ACCTCTCCGA GCTCAGCTAC TCCATCAGAA AAAACACCAC CCTCCTTACC
28381 TGCCGGGAAC GTACGAGTGC GTCACCGGCC GCTGCACCAC ACCTACCGCC TGACCGTAAA 
28441 CCAGACTTTT TCCGGACAGA CCTCAATAAC TCTGTTTACC AGAACAGGAG GTGAGCTTAG 
28501 AAAACCCTTA GGGTATTAGG CCAAAGGCGC AGCTACTGTG GGGTTTATGA ACAATTCAAG 
28561 CAACTCTACG GGCTATTCTA ATTCAGGTTT CTCTAGAATC GGGGTTGGGG TTATTCTCTG 
28621 TCTTGTGATT CTCTTTATTC TTATACTAAC GCTTCTCTGC CTAAGGCTCG CCGCCTGCTG 
28681 TGTGCACATT TGCATTTATT GTCAGCTTTT TAAACGCTGG GGTCGCCACC CAAGATGATT 
28741 AGGTACATAA TCCTAGGTTT ACTCACCCTT GCGTCAGCCC ACGGTACCAC CCAAAAGGTG 
28801 GATTTTAAGG AGCCAGCCTG TAATGTTACA TTCGCAGCTG AAGCTAATGA GTGCACCACT 
28861 CTTATAAAAT GCACCACAGA ACATGAAAAG CTGCTTATTC GCCACAAAAA CAAAATTGGC 
28921 AAGTATGCTG TTTATGCTAT TTGGCAGCCA GGTGACACTA CAGAGTATAA TGTTACAGTT 
28981 TTCCAGGGTA AAAGTCATAA AACTTTTATG TATACTTTTC CATTTTATGA AATGTGCGAC 
29041 ATTACCATGT ACATGAGCAA ACAGTATAAG TTGTGGCCCC CACAAAATTG TGTGGAAAAC 
29101 ACTGGCACTT TCTGCTGCAC TGCTATGCTA ATTACAGTGC TCGCTTTGGT CTGTACCCTA 
29161 CTCTATATTA AATACAAAAG CAGACGCAGC TTTATTGAGG AAAAGAAAAT GCCTTAATTT 
29221 ACTAAGTTAC AAAGCTAATG TCACCACTAA CTGCTTTACT CGCTGCTTGC AAAACAAATT 
29281 CAAAAAGTTA GCATTATAAT TAGAATAGGA TTTAAACCCC CCGGTCATTT CCTGCTCAAT 
29341 ACCATTCCCC TGAACAATTG ACTCTATGTG GGATATGCTC CAGCGCTACA ACCTTGAAGT 
29401 CAGGCTTCCT GGATGTCAGC ATCTGACTTT GGCCAGCACC TGTCCCGCGG ATTTGTTCCA 
29461 GTCCAACTAC AGCGACCCAC CCTAACAGAG ATGACCAACA CAACCAACGC GGCCGCCGCT 
29521 ACCGGACTTA CATCTACCAC AAATACACCC CAAGTTTCTG CCTTTGTCAA TAACTGGGAT 
29581 AACTTGGGCA TGTGGTGGTT CTCCATAGCG CTTATGTTTG TATGCCTTAT TATTATGTGG 
29641 CTCATCTGCT GCCTAAAGCG CAAACGCGCC CGACCACCCA TCTATAGTCC CATCATTGTG
29701 CTACACCCAA ACAATGATGG AATCCATAGA TTGGACGGAC TGAAACACAT GTTCTTTTCT
29761 CTTACAGTAT GATTAAATGA GACATGATTC CTCGAGTTTT TATATTACTG ACCCTTGTTG 
29821 CGCTTTTTTG TGCGTGCTCC ACATTGGCTG CGGTTTCTCA CATCGAAGTA GACTGCATTC
29881 CAGCCTTCAC AGTCTATTTG CTTTACGGAT TTGTCACCCT CACGCTCATC TGCAGCCTCA 
29941 TCACTGTGGT CATCGCCTTT ATCCAGTGCA TTGACTGGGT CTGTGTGCGC TTTGCATATC 
30001 TCAGACACCA TCCCCAGTAC AGGGACAGGA CTATAGCTGA GCTTCTTAGA ATTCTTTAAT 
30061 TATGAAATTT ACTGTGACTT TTCTGCTGAT TATTTGCACC CTATCTGCGT TTTGTTCCCC 
30121 GACCTCCAAG CCTCAAAGAC ATATATCATG CAGATTCACT CGTATATGGA ATATTCCAAG 
30181 TTGCTACAAT GAAAAAAGCG ATCTTTCCGA AGCCTGGTTA TATGCAATCA TCTCTGTTAT 
30241 GGTGTTCTGC AGTACCATCT TAGCCCTAGC TATATATCCC TACCTTGACA TTGGCTGGAA 
30301 ACGAATAGAT GCCATGAACC ACCCAACTTT CCCCGCGCCC GCTATGCTTC CACTGCAACA
30361 AGTTGTTGCC GGCGGCTTTG TCCCAGCCAA TCAGCCTCGC CCCACTTCTC CCACCCCCAC
30421 TGAAATCAGC TACTTTAATC TAACAGGAGG AGATGACTGA CACCCTAGAT CTAGAAATGG 
30481 ACGGAATTAT TACAGAGCAG CGCCTGCTAG AAAGACGCAG GGCAGCGGCC GAGCAACAGC 
30541 GCATGAATCA AGAGCTCCAA GACATGGTTA ACTTGCACCA GTGCAAAAGG GGTATCTTTT 
30601 GTCTGGTAAA GCAGGCCAAA GTCACCTACG ACAGTAATAC CACCGGACAC CGCCTTAGCT 
30661 ACAAGTTGCC AACCAAGCGT CAGAAATTGG TGGTCATGGT GGGAGAAAAG CCCATTACCA 
30721 TAACTCAGCA CTCGGTAGAA ACCGAAGGCT GCATTCACTC ACCTTGTCAA GGACCTGAGG 
30781 ATCTCTGCAC CCTTATTAAG ACCCTGTGCG GTCTCAAAGA TCTTATTCCC TTTAACTAAT 
30841 AAAAAAAAAT AATAAAGCAT CACTTACTTA AAATCAGTTA GCAAATTTCT GTCCAGTTTA 
30901 TTCAGCAGCA CCTCCTTGCC CTCCTCCCAG CTCTGGTATT GCAGCTTCCT CCTGGCTGCA 
30961 AACTTTCTCC ACAATCTAAA TGGAATGTCA GTTTCCTCCT GTTCCTGTCC ATCCGCACCC 
31021 ACTATCTTCA TGTTGTTGCA GATGAAGCGC GCAAGACCGT CTGAAGATAC CTTCAACCCC 
31081 GTGTATCCAT ATGACACGGA AACCGGTCCT CCAACTGTGC CTTTTCTTAC TCCTCCCTTT 
31141 GTATCCCCCA ATGGGTTTCA AGAGAGTCCC CCTGGGGTAC TCTCTTTGCG CCTATCCGAA 
31201 CCTCTAGTTA CCTCCAATGG CATGCTTGCG CTCAAAATGG GCAACGGCCT CTCTCTGGAC 
31261 GAGGCCGGCA ACCTTACCTC CCAAAATGTA ACCACTGTGA GCCCACCTCT CAAAAAAACC
31321 AAGTCAAACA TAAACCTGGA AATATCTGCA CCCCTCACAG TTACCTCAGA AGCCCTAACT 
31381 GTGGCTGCCG CCGCACCTCT AATGGTCGCG GGCAACACAC TCACCATGCA ATCACAGGCC 
31441 CCGCTAACCG TGCACGACTC CAAACTTAGC ATTGCCACCC AAGGACCCCT CACAGTGTCA 
31501 GAAGGAAAGC TAGCCCTGCA AACATCAGGC CCCCTCACCA CCACCGATAG CAGTACCCTT 
31561 ACTATCACTG CCTCACCCCC TCTAACTACT GCCACTGGTA GCTTGGGCAT TGACTTGAAA 
31621 GAGCCCATTT ATACACAAAA TGGAAAACTA GGACTAAAGT ACGGGGCTCC TTTGCATGTA 
31681 ACAGACGACC TAAACACTTT GACCGTAGCA ACTGGTCCAG GTGTGACTAT TAATAATACT 
31741 TCCTTGCAAA CTAAAGTTAC TGGAGCCTTG GGTTTTGATT CACAAGGCAA TATGCAACTT 
31801 AATGTAGCAG GAGGACTAAG GATTGATTCT CAAAACAGAC GCCTTATACT TGATGTTAGT 
31861 TATCCGTTTG ATGCTCAAAA CCAACTAAAT CTAAGACTAG GACAGGGCCC TCTTTTTATA 
31921 AACTCAGCCC ACAACTTGGA TATTAACTAC AACAAAGGCC TTTACTTGTT TACAGCTTCA 
31981 AACAATTCCA AAAAGCTTGA GGTTAACCTA AGCACTGCCA AGGGGTTGAT GTTTGACGCT 
32041 ACAGCCATAG CCATTAATGC AGGAGATGGG CTTGAATTTG GTTCACCTAA TGCACCAAAC 
32101 ACAAATCCCC TCAAAACAAA AATTGGCCAT GGCCTAGAAT TTGATTCAAA CAAGGCTATG 
32161 GTTCCTAAAC TAGGAACTGG CCTTAGTTTT GACAGCACAG GTGCCATTAC AGTAGGAAAC 
32221 AAAAATAATG ATAAGCTAAC TTTGTGGACC ACACCAGCTC CATCTCCTAA CTGTAGACTA
32281 AATGCAGAGA AAGATGCTAA ACTCACTTTG GTCTTAACAA AATGTGGCAG TCAAATACTT 
32341 GCTACAGTTT CAGTTTTGGC TGTTAAAGGC AGTTTGGCTC CAATATCTGG AACAGTTCAA 
32401 AGTGCTCATC TTATTATAAG ATTTGACGAA AATGGAGTGC TACTAAACAA TTCCTTCCTG 
32461 GACCCAGAAT ATTGGAACTT TAGAAATGGA GATCTTACTG AAGGCACAGC CTATACAAAC 
32521 GCTGTTGGAT TTATGCCTAA CCTATCAGCT TATCCAAAAT CTCACGGTAA AACTGCCAAA 
32581 AGTAACATTG TCAGTCAAGT TTACTTAAAC GGAGACAAAA CTAAACCTGT AACACTAACC 
32641 ATTACACTAA ACGGTACACA GGAAACAGGA GACACAACTC CAAGTGCATA CTCTATGTCA 
32701 TTTTCATGGG ACTGGTCTGG CCACAACTAC ATTAATGAAA TATTTGCCAC ATCCTCTTAC
32761 ACTTTTTCAT ACATTGCCCA AGAATAAAGA ATCGTTTGTG TTATGTTTCA ACGTGTTTAT 
32821 TTTTCAATTG CAGAAAATTT CAAGTCATTT TTCATTCAGT AGTATAGCCC CACCACCACA 
32881 TAGCTTATAC AGATCACCGT ACCTTAATCA AACTCACAGA ACCCTAGTAT TCAACCTGCC 
32941 ACCTCCCTCC CAACACACAG AGTACACAGT CCTTTCTCCC CGGCTGGCCT TAAAAAGCAT 
33001 CATATCATGG GTAACAGACA TATTCTTAGG TGTTATATTC CACACGGTTT CCTGTCGAGC
33061 CAAACGCTCA TCAGTGATAT TAATAAACTC CCCGGGCAGC TCACTTAAGT TCATGTCGCT 
33121 GTCCAGCTGC TGAGCCACAG GCTGCTGTCC AACTTGCGGT TGCTTAACGG GCGGCGAAGG 
33181 AGAAGTCCAC GCCTACATGG GGGTAGAGTC ATAATCGTGC ATCAGGATAG GGCGGTGGTG 
33241 CTGCAGCAGC GCGCGAATAA ACTGCTGCCG CCGCCGCTCC GTCCTGCAGG AATACAACAT 
33301 GGCAGTGGTC TCCTCAGCGA TGATTCGCAC CGCCCGCAGC ATAAGGCGCC TTGTCCTCCG 
33361 GGCACAGCAG CGCACCCTGA TCTCACTTAA ATCAGCACAG TAACTGCAGC ACAGCACCAC 
33421 AATATTGTTC AAAATCCCAC AGTGCAAGGC GCTGTATCCA AAGCTCATGG CGGGGACCAC 
33481 AGAACCCACG TGGCCATCAT ACCACAAGCG CAGGTAGATT AAGTGGCGAC CCCTCATAAA 
33541 CACGCTGGAC ATAAACATTA CCTCTTTTGG CATGTTGTAA TTCACCACCT CCCGGTACCA
33601 TATAAACCTC TGATTAAACA TGGCGCCATC CACCACCATC CTAAACCAGC TGGCCAAAAC 
33661 CTGCCCGCCG GCTATACACT GCAGGGAACC GGGACTGGAA CAATGACAGT GGAGAGCCCA 
33721 GGACTCGTAA CCATGGATCA TCATGCTCGT CATGATATCA ATGTTGGCAC AACACAGGCA
33781 CACGTGCATA CACTTCCTCA GGATTACAAG CTCCTCCCGC GTTAGAACCA TATCCCAGGG
33841 AACAACCCAT TCCTGAATCA GCGTAAATCC CACACTGCAG GGAAGACCTC GCACGTAACT 
33901 CACGTTGTGC ATTGTCAAAG TGTTACATTC GGGCAGCAGC GGATGATCCT CCAGTATGGT 
33961 AGCGCGGGTT TCTGTCTCAA AAGGAGGTAG ACGATCCCTA CTGTACGGAG TGCGCCGAGA 
34021 CAACCGAGAT CGTGTTGGTC GTAGTGTCAT GCCAAATGGA ACGCCGGACG TAGTCATATT
34081 TCCTGAAGCA AAACCAGGTG CGGGCGTGAC AAACAGATCT GCGTCTCCGG TCTCGCCGCT 
34141 TAGATCGCTC TGTGTAGTAG TTGTAGTATA TCCACTCTCT CAAAGCATCC AGGCGCCCCC 
34201 TGGCTTCGGG TTCTATGTAA ACTCCTTCAT GCGCCGCTGC CCTGATAACA TCCACCACCG 
34261 CAGAATAAGC CACACCCAGC CAACCTACAC ATTCGTTCTG CGAGTCACAC ACGGGAGGAG 
34321 CGGGAAGAGC TGGAAGAACC ATGTTTTTTT TTTTATTCCA AAAGATTATC CAAAACCTCA 
34381 AAATGAAGAT CTATTAAGTG AACGCGCTCC CCTCCGGTGG CGTGGTCAAA CTCTACAGCC 
34441 AAAGAACAGA TAATGGCATT TGTAAGATGT TGCACAATGG CTTCCAAAAG GCAAACGGCC 
34501 CTCACGTCCA AGTGGACGTA AAGGCTAAAC CCTTCAGGGT GAATCTCCTC TATAAACATT 
34561 CCAGCACCTT CAACCATGCC CAAATAATTC TCATCTCGCC ACCTTCTCAA TATATCTCTA 
34621 AGCAAATCCC GAATATTAAG TCCGGCCATT GTAAAAATCT GCTCCAGAGC GCCCTCCACC 
34681 TTCAGCCTCA AGCAGCGAAT CATGATTGCA AAAATTCAGG TTCCTCACAG ACCTGTATAA
34741 GATTCAAAAG CGGAACATTA ACAAAAATAC CGCGATCCCG TAGGTCCCTT CGCAGGGCCA 
34801 GCTGAACATA ATCGTGCAGG TCTGCACGGA CCAGCGCGGC CACTTCCCCG CCAGGAACCT 
34861 TGACAAAAGA ACCCACACTG ATTATGACAC GCATACTCGG AGCTATGCTA ACCAGCGTAG
34921 CCCCGATGTA AGCTTTGTTG CATGGGCGGC GATATAAAAT GCAAGGTGCT GCTCAAAAAA 
34981 TCAGGCAAAG CCTCGCGCAA AAAAGAAAGC ACATCGTAGT CATGCTCATG CAGATAAAGG 
35041 CAGGTAAGCT CCGGAACCAC CACAGAAAAA GACACCATTT TTCTCTCAAA CATGTCTGCG 
35101 GGTTTCTGCA TAAACACAAA ATAAAATAAC AAAAAAACAT TTAAACATTA GAAGCCTGTC 
35161 TTACAACAGG AAAAACAACC CTTATAAGCA TAAGACGGAC TACGGCCATG CCGGCGTGAC 
35221 CGTAAAAAAA CTGGTCACCG TGATTAAAAA GCACCACCGA CAGCTCCTCG GTCATGTCCG 
35281 GAGTCATAAT GTAAGACTCG GTAAACACAT CAGGTTGATT CATCGGTCAG TGCTAAAAAG 
35341 CGACCGAAAT AGCCCGGGGG AATACATACC CGCAGGCGTA GAGACAACAT TACAGCCCCC 
35401 ATAGGAGGTA TAACAAAATT AATAGGAGAG AAAAACACAT AAACACCTGA AAAACCCTCC 
35461 TGCCTAGGCA AAATAGCACC CTCCCGCTCC AGAACAACAT ACAGCGCTTC ACAGCGGCAG 
35521 CCTAACAGTC AGCCTTACCA GTAAAAAAGA AAACCTATTA AAAAAACACC ACTCGACACG 
35581 GCACCAGCTC AATCAGTCAC AGTGTAAAAA AGGGCCAAGT GCAGAGCGAG TATATATAGG 
35641 ACTAAAAAAT GACGTAACGG TTAAAGTCCA CAAAAAACAC CCAGAAAACC GCACGCGAAC 
35701 CTACGCCCAG AAACGAAAGC CAAAAAACCC ACAACTTCCT CAAATCGTCA CTTCCGTTTT 
35761 CCCACGTTAC GTAACTTCCC ATTTTAAGAA AACTACAATT CCCAACACAT ACAAGTTACT 
35821 CCGCCCTAAA ACCTACGTCA CCCGCCCCGT TCCCACGCCC CGCGCCACGT CACAAACTCC 
35881 ACCCCCTCAT TATCATATTG GCTTCAATCC AAAATAAGGT ATATTATTGA TGATG 
Western blot on whole-cell extracts from 293 cells transfected with plasmid DNA expressing the
different HCV NS cassettes. Mature NS3 and NS5A products were detected with specific antibodies.

IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 25 µg DNA/dose with
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.

IFNγ ELISPOT on splenocytes from BalbC mice immunized with two injections of 50 µg DNA/dose with
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.

Western blot on whole-cell extracts from HeLa cells infected at different multiplicity of infection
(m.o.i.; indicated at the top) with Adenovectors expressing the different HCV NS cassettes. Mature
NS5B and NS5A products were detected with specific antibodies.

IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 10^9 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.

IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.



FIG. 16A

WO03/031588    77/92    PCT/US02/32512

MRK Ad5-NSmut 10^10 vp/dose

Pep pools    S201    075Q    137Q
F(NS3p)       928      69     254
G(NS3h)       317     436      98
H(NS4)         56     101      45
I(NS5a)      1530    1100     413
L(NS5b)       149      23      92
M(NS5b)       398      32      80
DMSO           29       6      29

MRK Ad6-NSOPTmut 10^10 vp/dose

Pep pools    98D209    106Q    113Q
F(NS3p)        3110     2S3     404
G(NS3h)        2115     642    1008
H(NS4)          373      72      19
I(NS5a)         103      37     347
L(NS5b)         149      22      10
M(NS5b)         314     428      19
DMSO              0       1       3

IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.

FIG. 16B
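
The peptide-pool counts tabulated for MRK Ad5-NSmut can be read against the DMSO row, which is presumably the solvent-only control well. The following Python sketch is illustrative only: the source does not describe any background subtraction, so the helper `subtract_background` and the choice to floor negative values at zero are assumptions, not the applicants' analysis. The counts are copied from the table, in the column order of the three animals listed there.

```python
# Spot-forming cells (SFC) per 10^6 PBMC, copied from the MRK Ad5-NSmut table.
# Each list holds one value per animal, in the table's column order.
ad5_nsmut = {
    "F(NS3p)": [928, 69, 254],
    "G(NS3h)": [317, 436, 98],
    "H(NS4)": [56, 101, 45],
    "I(NS5a)": [1530, 1100, 413],
    "L(NS5b)": [149, 23, 92],
    "M(NS5b)": [398, 32, 80],
    "DMSO": [29, 6, 29],
}


def subtract_background(table, control="DMSO"):
    """Hypothetical helper: remove the control-well count from every pool.

    Values that would become negative are floored at zero (an assumption).
    """
    background = table[control]
    return {
        pool: [max(count - bg, 0) for count, bg in zip(counts, background)]
        for pool, counts in table.items()
        if pool != control
    }


net = subtract_background(ad5_nsmut)
for pool, counts in net.items():
    print(pool, counts)
```

Under these assumptions the per-animal net counts keep the same ranking of peptide pools as the raw values; the subtraction only removes the small control signal.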
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.

IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.

IFNγ ICS on PBMC from Rhesus monkeys immunized with two injections at a four-week interval with 10^11 vp/dose
of Adenovectors expressing the different HCV NS cassettes. Data are expressed as number of positive
IFNγ/CD3/CD8 per 10^6 lymphocytes.

IFNγ ICS on PBMC from Rhesus monkeys immunized with two injections at a four-week interval with 10^11 vp/dose
of Adenovectors expressing the different HCV NS cassettes. Data are expressed as number of positive
IFNγ/CD3/CD8 per 10^6 lymphocytes.

Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of Ad5-NS.

Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of Ad5-NS.

Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd5-NSmut.

Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd5-NSmut.

Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd6-NSmut.

Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd6-NSmut.

2351 TGCTGTTCAA CATTCTGGGC GGATGGGTGG CCGCTCAGCT GGCCCCTCCT
2401 TCAGCTGCTT CTGCCTTTGT GGGCGCTGGC ATTGCCGGAG CCGCTGTGGG 
2451 CAGCATTGGC CTGGGCAAAG TGCTGGTGGA TATTCTGGCT GGCTATGGCG 
2501 CTGGCGTGGC CGGAGCCCTG GTGGCCTTCA AGGTGATGAG CGGAGAGATG 
2551 CCCAGCACCG AGGACCTGGT GAACCTGCTG CCTGCCATTC TGAGCCCTGG 
2601 AGCCCTGGTG GTGGGCGTGG TGTGTGCTGC CATTCTGAGG CGCCATGTGG 
2651 GACCCGGAGA GGGCGCTGTG CAGTGGATGA ACCGCCTGAT CGCCTTCGCC 
2701 TCTCGCGGAA ACCACGTGAG CCCTACCCAC TACGTGCCTG AGAGCGACGC 
2751 CGCTGCCAGG GTGACCCAGA TCCTGAGCAG CCTGACCATC ACCCAGCTGC 
2801 TGAAGCGCCT GCACCAGTGG ATCAACGAGG ACTGCAGCAC ACCCTGCAGC 
2851 GGAAGCTGGC TGAGGGACGT GTGGGACTGG ATCTGCACCG TGCTGACCGA 
2901 CTTCAAGACC TGGCTGCAGA GCAAGCTGCT GCCCCAACTG CCTGGCGTGC 
2951 CCTTCTTCTC ATGCCAGCGC GGATACAAGG GCGTGTGGAG GGGCGATGGC 
3001 ATCATGCAGA CCACCTGTCC CTGCGGAGCC CAGATCACAG GCCACGTGAA 
3051 GAACGGCAGC ATGCGCATCG TGGGCCCTAA GACCTGCAGC AACACCTGGC 
3101 ACGGCACCTT CCCCATCAAC GCCTACACCA CCGGACCCTG CACACCCAGC 
3151 CCTGCTCCCA ACTACAGCAG GGCCCTGTGG AGGGTGGCTG CCGAGGAGTA 
3201 CGTGGAGGTG ACCAGGGTGG GAGACTTCCA CTACGTGACC GGAATGACCA

3251 CCGACAACGT GAAGTGTCCC TGTCAGGTGC CCGCTCCCGA ATTTTTTACC 

3301 GAAGTGGATG GCGTGCGCCT GCATCGCTAT GCCCCTGCCT GTAGGCCCCT

3351 GCTGCGCGAA GAAGTGACCT TCCAGGTGGG CCTGAACCAG TACCTGGTGG 

3401 GCAGCCAGCT GCCCTGCGAG CCTGAGCCCG ATGTGGCCGT GCTGACCAGC 

3451 ATGCTGACCG ACCCCAGCCA CATCACAGCC GAAACCGCTA AAAGGCGCCT 

3501 GGCCAGGGGC TCTCCTCCAA GCCTGGCCTC AAGCAGCGCT AGCCAGCTGT

3551 CTGCTCCCAG CCTGAAGGCC ACCTGCACCA CCCACCACGT GAGCCCCGAC 

3601 GCCGACCTGA TCGAGGCCAA CCTGCTGTGG CGCCAGGAGA TGGGCGGCAA 

3651 CATCACGCGC GTGGAGAGCG AGAACAAGGT GGTGGTGCTG GACAGCTTCG 

3701 ACCCCCTGCG CGCCGAGGAG GACGAGCGCG AGGTGAGCGT GCCCGCCGAG 

3751 ATCCTGCGCA AGAGCAAGAA GTTCCCCGCT GCCATGCCCA TCTGGGCTAG 

3801 ACCTGATTAC AACCCTCCCC TGCTGGAGAG CTGGAAGGAC CCTGATTACG 

3851 TGCCTCCAGT GGTGCATGGC TGTCCTCTGC CTCCCATTAA AGCCCCTCCT

3901 ATTCCACCTC CTAGGCGCAA AAGGACCGTG GTGCTGAGAG AAAGCAGCGT 

3951 GAGCTCTGCT CTGGCCGAAC TGGCCACCAA GACCTTTGGC AGCAGCGAGA

4001 GCTCTGCCGT GGACAGCGGA AGAGCCACCG CTCTGCCTGA CCAGGCCAGC

4051 GAGGACGGCG ATAAGGGCAG CGATGTGGAG AGCTATAGCA GCATGCCTCC 

4101 CCTGGAAGGC GAACCTGGCG ATCCCGATCT GAGCGATGGC AGCTGGAGCA

4151 CCGTGAGCGA AGAGGCCAGC GAGGACGTGG TGTGTTGCAG CATGAGCTAC 

4201 ACCTGGACAG GCGCTCTGAT CACACCCTGC GCTGCCGAGG AGAGCAAGCT 

4251 GCCCATCAAC GCCCTGAGCA ACAGCCTGCT GAGGGACCAC AACATGGTGT 

4301 ACGCCACCAC CAGCAGGTCT GCCGGACTGA GGCAGAAGAA GGTGACCTTC

4351 GACCGCCTGC AGGTGCTGGA CGACCACTAC CGCGATGTGC TGAAGGAGAT 

4401 GAAGGCCAAG GCCAGCACCG TGAAGGCCAA GCTGCTGAGC GTGGAGGAGG 

4451 CCTGCAAGCT GACCCCCCCC CACAGCGCCA AGAGCAAGTT CGGCTACGGC 

4501 GCCAAGGACG TGCGCAACCT GAGCAGCAAG GCCGTGAACC ACATCCACAG 

4551 CGTGTGGAAG GACCTGCTGG AGGACACCGT GACCCCCATC GACACCACCA 

4601 TCATGGCCAA GAACGAGGTG TTCTGCGTGC AGCCCGAGAA GGGCGGCCGC 

4651 AAGCCCGCTC GCCTGATCGT GTTCCCCGAT CTGGGCGTGC GCGTGTGCGA 

4701 GAAGATGGCC CTGTACGACG TGGTGAGCAC CCTGCCTCAG GTGGTGATGG 

4751 GCTCAAGCTA CGGCTTCCAG TACAGCCCTG GCCAGCGCGT GGAGTTCCTG 
4801 GTGAACACCT GGAAGAGCAA GAAGAACCCC ATGGGCTTCA GCTACGACAC
4851 ACGCTGCTTC GACAGCACCG TGACCGAGAA CGACATCCGC GTGGAGGAGA 
4901 GCATCTACCA GTGCTGCGAC CTGGCCCCTG AGGCCAGGCA GGCCATCAAG 
4951 AGCCTGACCG AGCGCCTGTA CATCGGAGGC CCTCTGACCA ACAGCAAGGG 
5001 ACAGAACTGC GGATACAGGC GCTGTAGGGC CTCTGGCGTG CTGACCACCA 
5051 GCTGTGGCAA CACCCTGACC TGCTACCTGA AGGCCAGCGC TGCCTGTCGC 
5101 GCTGCCAAGC TGCAGGACTG CACCATGCTG GTGAACGCCG CTGGCCTGGT 
5151 GGTGATTTGT GAAAGCGCTG GCACCCAGGA AGATGCTGCC AGCCTGCGCG 
5201 TGTTCACCGA GGCCATGACC AGGTACTCTG CCCCTCCCGG AGACCCCCCT 
5251 CAGCCCGAAT ACGACCTGGA GCTGATCACC AGCTGCTCAA GCAACGTGAG 
5301 CGTGGCTCAC GACGCCAGCG GAAAGCGCGT GTACTACCTG ACACGCGATC 
5351 CCACCACCCC TCTGGCTCGC GCTGCCTGGG AAACCGCTCG CCATACACCC 
5401 GTGAACAGCT GGCTGGGCAA CATCATCATG TACGCCCCTA CCCTGTGGGC 
5451 TCGCATGATC CTGATGACCC ACTTCTTCAG CATCCTGCTG GCTCAGGAGC 
5501 AGCTGGAGAA GGCCCTGGAC TGCCAGATTT ACGGCGCTTG CTACAGCATC 
5551 GAGCCCCTGG ACCTGCCCCA AATCATCGAG CGCCTGCACG GCCTGTCTGC 
5601 CTTCAGCCTG CACAGCTACA GCCCTGGCGA AATTAATCGC GTGGCCAGCT 
5651 GTCTGCGCAA ACTGGGCGTG CCTCCTCTGC GCGTGTGGAG GCATAGGGCT 
5701 AGGAGCGTGA GGGCTAGGCT GCTGAGCCAG GGAGGCAGGG CCGCTACCTG
5751 TGGAAAGTAC CTGTTCAACT GGGCCGTGAA GACCAAGCTG AAGCTGACCC 
5801 CTATCCCTGC CGCTAGCCAG CTGGACCTGA GCGGATGGTT CGTGGCTGGC 
5851 TACAGCGGAG GCGACATCTA CCACAGCCTG TCTCGCGCTC GCCCTCGCTG 
5901 GTTCATGCTG TGCCTGCTGC TGCTGAGCGT GGGCGTGGGC ATCTACCTGC 
5951 TGCCCAACCG CTAAA 
SEQUENCE LISTING

<110> Merck & Co. Inc., and Istituto Di Ricerche Di Biologia Molecolare P 
Angeletti S.P.A. 

<120> HEPATITIS C VIRUS VACCINE 

<130> ITR0015Y 

<150> 60/363,774 
<151> 2002-03-13 

<150> 60/328,655 
<151> 2001-10-11 

<160> 17 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1985 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide 
<400> 1 

Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1 5 10 15

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
20 25 30

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
35 40 45

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
50 55 60

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65 70 75 80

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
85 90 95

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
100 105 110

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
115 120 125

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
130 135 140

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145 150 155 160

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
165 170 175

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
180 185 190

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
195 200 205
Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
210 215 220

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
225 230 235 240

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
245 250 255

Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
260 265 270

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
275 280 285

Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
290 295 300

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
305 310 315 320

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
325 330 335

Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
340 345 350

Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
355 360 365

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
370 375 380

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
385 390 395 400

Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
405 410 415

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
420 425 430

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
435 440 445

Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
450 455 460

Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
465 470 475 480

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
485 490 495

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
500 505 510

Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
515 520 525

His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
530 535 540

Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
545 550 555 560

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
565 570 575

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
580 585 590

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
595 600 605

Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
610 615 620

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
625 630 635 640
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val
645 650 655

Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
660 665 670

Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
675 680 685

His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
690 695 700

Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
705 710 715 720

Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
725 730 735

Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
740 745 750

Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
755 760 765

Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
770 775 780

Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
785 790 795 800

Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
805 810 815

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
820 825 830

Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
835 840 845

Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro
850 855 860

Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His
865 870 875 880

Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala
885 890 895

Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu
900 905 910

Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile
915 920 925

Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser
930 935 940

Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys
945 950 955 960

Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro
965 970 975

Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly
980 985 990

Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala
995 1000 1005

Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro
1010 1015 1020

Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr
1025 1030 1035 1040

Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala
1045 1050 1055

Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly
1060 1065 1070
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro
1075 1080 1085

Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg
1090 1095 1100

Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val
1105 1110 1115 1120

Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro
1125 1130 1135

Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp
1140 1145 1150

Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly
1155 1160 1165

Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
1170 1175 1180

Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp
1185 1190 1195 1200

Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
1205 1210 1215

Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp
1220 1225 1230

Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu
1235 1240 1245

Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala
1250 1255 1260

Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp
1265 1270 1275 1280

Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala
1285 1290 1295

Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu
1300 1305 1310

Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly
1315 1320 1325

Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro
1330 1335 1340

Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr
1345 1350 1355 1360

Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser
1365 1370 1375

Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val
1380 1385 1390

Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys
1395 1400 1405

Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu
1410 1415 1420

Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly
1425 1430 1435 1440

Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp
1445 1450 1455

His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1460 1465 1470

Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro
1475 1480 1485

His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn
1490 1495 1500
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu
1505 1510 1515 1520

Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
1525 1530 1535

Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
1540 1545 1550

Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala
1555 1560 1565

Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser
1570 1575 1580

Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn
1585 1590 1595 1600

Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg
1605 1610 1615

Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser
1620 1625 1630

Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys
1635 1640 1645

Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys
1650 1655 1660

Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr
1665 1670 1675 1680

Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala
1685 1690 1695

Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Ala Ala
1700 1705 1710

Gly Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala
1715 1720 1725

Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro
1730 1735 1740

Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys
1745 1750 1755 1760

Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr
1765 1770 1775

Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu
1780 1785 1790

Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met
1795 1800 1805

Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe
1810 1815 1820

Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln
1825 1830 1835 1840

Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile
1845 1850 1855

Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser
1860 1865 1870

Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val
1875 1880 1885

Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg
1890 1895 1900

Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe
1905 1910 1915 1920

Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala
1925 1930 1935
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly
1940 1945 1950

Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu
1955 1960 1965

Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn
1970 1975 1980

Arg
1985
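
SEQ ID NO: 2 is described in this listing as a cDNA encoding SEQ ID NO: 1, and that relationship can be spot-checked on the opening nucleotides. The Python sketch below is illustrative only. It assumes that the 10-nt blocks of the SEQ ID NO: 2 listing are read row-wise (the first block of each column forms the first row), that translation starts at the first ATG after a 6-nt leader, and that the standard genetic code applies. The names `CODONS` and `translate` are hypothetical, and the codon table is deliberately limited to the codons present in the fragment.

```python
# Partial standard genetic code: only the codons occurring in the fragment below.
CODONS = {
    "atg": "Met", "gcg": "Ala", "ccc": "Pro", "atc": "Ile", "acg": "Thr",
    "gcc": "Ala", "tac": "Tyr", "tcc": "Ser", "caa": "Gln", "cag": "Gln",
    "cgg": "Arg", "ggc": "Gly", "cta": "Leu",
}


def translate(dna, start):
    """Translate every complete codon of `dna` from 0-based offset `start`."""
    residues = []
    for i in range(start, len(dna) - 2, 3):
        residues.append(CODONS[dna[i:i + 3]])
    return residues


# First five 10-nt blocks of the SEQ ID NO: 2 listing, read row-wise (assumed order).
fragment = "gccaccatgg" "cgcccatcac" "ggcctactcc" "caacagacgc" "ggggcctact"

start = fragment.index("atg")  # the ATG follows the 6-nt leader "gccacc"
protein = translate(fragment, start)
print(" ".join(protein))
```

The translated residues can be compared by eye with the first line of SEQ ID NO: 1; the two trailing nucleotides of the fragment form an incomplete codon and are ignored.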

<210> 2 
<211> 5965 
<212> DNA 

<213> Artificial Sequence

<220>

<223> Non-optimized cDNA sequence encoding SEQ. ID. NO. 1

<400> 2

gccaccatgg cgcccatcac ggcctactcc caacagacgc ggggcctact tggttgcatc 60

atcactagcc ttacaggccg ggacaagaac caggtcgagg gagaggttca ggtggtttcc 120

accgcaacac aatccttcct ggcgacctgc gtcaacggcg tgtgttggac cgtttaccat 180

ggtgctggct caaagacctt agccggccca aaggggccaa tcacccagat gtacactaat 240

gtggaccagg acctcgtcgg ctggcaggcg ccccccgggg cgcgttcctt gacaccatgc 300

acctgtggca gctcagacct ttacttggtc acgagacatg ctgacgtcat tccggtgcgc 360

cggcggggcg acagtagggg gagcctgctc tcccccaggc ctgtctccta cttgaagggc 420

tcttcgggtg gtccactgct ctgcccttcg gggcacgctg tgggcatctt ccgggctgcc 480

gtatgcaccc ggggggttgc gaaggcggtg gactttgtgc ccgtagagtc catggaaact 540

actatgcggt ctccggtctt cacggacaac tcatcccccc cggccgtacc gcagtcattt 600

caagtggccc acctacacgc tcccactggc agcggcaaga gtactaaagt gccggctgca 660

tatgcagccc aagggtacaa ggtgctcgtc ctcaatccgt ccgttgccgc taccttaggg 720

tttggggcgt atatgtctaa ggcacacggt attgacccca acatcagaac tggggtaagg 780

accattacca caggcgcccc cgtcacatac tctacctatg gcaagtttct tgccgatggt 840

ggttgctctg ggggcgctta tgacatcata atatgtgatg agtgccattc aactgactcg 900

actacaatct tgggcatcgg cacagtcctg gaccaagcgg agacggctgg agcgcggctt 960

gtcgtgctcg ccaccgctac gcctccggga tcggtcaccg tgccacaccc aaacatcgag 1020

gaggtggccc tgtctaatac tggagagatc cccttctatg gcaaagccat ccccattgaa 1080

gccatcaggg ggggaaggca tctcattttc tgtcattcca agaagaagtg cgacgagctc 1140

gccgcaaagc tgtcaggcct cggaatcaac gctgtggcgt attaccgggg gctcgatgtg 1200

tccgtcatac caactatcgg agacgtcgtt gtcgtggcaa cagacgctct gatgacgggc 1260

tatacgggcg actttgactc agtgatcgac tgtaacacat gtgtcaccca gacagtcgac 1320

ttcagcttgg atcccacctt caccattgag acgacgaccg tgcctcaaga cgcagtgtcg 1380

cgctcgcagc ggcggggtag gactggcagg ggtaggagag gcatctacag gtttgtgact 1440

ccgggagaac ggccctcggg catgttcgat tcctcggtcc tgtgtgagtg ctatgacgcg 1500

ggctgtgctt ggtacgagct cacccccgcc gagacctcgg ttaggttgcg ggcctacctg 1560

aacacaccag ggttgcccgt ttgccaggac cacctggagt tctgggagag tgtcttcaca 1620

ggcctcaccc acatagatgc acacttcttg tcccagacca agcaggcagg agacaacttc 1680

ccctacctgg tagcatacca agccacggtg tgcgccaggg ctcaggcccc acctccatca 1740

tgggatcaaa tgtggaagtg tctcatacgg ctgaaaccta cgctgcacgg gccaacaccc 1800

ttgctgtaca ggctgggagc cgtccaaaat gaggtcaccc tcacccaccc cataaccaaa 1860

tacatcatgg catgcatgtc ggctgacctg gaggtcgtca ctagcacctg ggtgctggtg 1920

ggcggagtcc ttgcagctct ggccgcgtat tgcctgacaa caggcagtgt ggtcattgtg 1980

ggtaggatta tcttgtccgg gaggccggct attgttcccg acagggagtt tctctaccag 2040

gagttcgatg aaatggaaga gtgcgcctcg cacctccctt acatcgagca gggaatgcag 2100

ctcgccgagc aattcaagca gaaagcgctc gggttactgc aaacagccac caaacaagcg 2160






gaggctgctg ctcccgtggt ggagtccaag tggcgagccc ttgagacatt ctgggcgaag 2220 

cacatgtgga atttcatcag cgggatacag tacttagcag gcttatccac tctgcctggg 2280 

aaccccgcaa tagcatcatt gatggcattc acagcctcta tcaccagccc gctcaccacc 2340 

caaagtaccc tcctgtttaa catcttgggg gggtgggtgg ctgcccaact cgcccccccc 2400 

agcgccgctt cggctttcgt gggcgccggc atcgccggtg cggctgttgg cagcataggc 2460 

cttgggaagg tgcttgtgga cattctggcg ggttatggag caggagtggc cggcgcgctc 2520

gtggccttca aggtcatgag cggcgagatg ccctccaccg aggacctggt caatctactt 2580 

cctgccatcc tctctcctgg cgccctggtc gtcggggtcg tgtgtgcagc aatactgcgt 2640 

cgacacgtgg gtccgggaga gggggctgtg cagtggatga accggctgat agcgttcgcc 2700 

tcgcggggta atcatgtttc ccccacgcac tatgtgcctg agagcgacgc cgcagcgcgt 2760 

gttactcaga tcctctccag ccttaccatc actcagctgc tgaaaaggct ccaccagtgg 2820 

attaatgaag actgctccac accgtgttcc ggctcgtggc taagggatgt ttgggactgg 2880 

atatgcacgg tgttgactga cttcaagacc tggctccagt ccaagctcct gccgcagcta 2940 

ccgggagtcc cttttttctc gtgccaacgc gggtacaagg gagtctggcg gggagacggc 3000 

atcatgcaaa ccacctgccc atgtggagca cagatcaccg gacatgtcaa aaacggttcc 3060 

atgaggatcg tcgggcctaa gacctgcagc aacacgtggc atggaacatt ccccatcaac 3120 

gcatacacca cgggcccctg cacaccctct ccagcgccaa actattctag ggcgctgtgg 3180 

cgggtggccg ctgaggagta cgtggaggtc acgcgggtgg gggatttcca ctacgtgacg 3240 

ggcatgacca ctgacaacgt aaagtgccca tgccaggttc cggctcctga attcttcacg 3300 

gaggtggacg gagtgcggtt gcacaggtac gctccggcgt gcaggcctct cctacgggag 3360 

gaggttacat tccaggtcgg gctcaaccaa tacctggttg ggtcacagct accatgcgag 3420 

cccgaaccgg atgtagcagt gctcacttcc atgctcaccg acccctccca catcacagca 3480 

gaaacggcta agcgtaggtt ggccaggggg tctcccccct ccttggccag ctcttcagct 3540 

agccagttgt ctgcgccttc cttgaaggcg acatgcacta cccaccatgt ctctccggac 3600 

gctgacctca tcgaggccaa cctcctgtgg cggcaggaga tgggcgggaa catcacccgc 3660 

gtggagtcgg agaacaaggt ggtagtcctg gactctttcg acccgcttcg agcggaggag 3720 

gatgagaggg aagtatccgt tccggcggag atcctgcgga aatccaagaa gttccccgca 3780 

gcgatgccca tctgggcgcg cccggattac aaccctccac tgttagagtc ctggaaggac 3840 

ccggactacg tccctccggt ggtgcacggg tgcccgttgc cacctatcaa ggcccctcca 3900 

ataccacctc cacggagaaa gaggacggtt gtcctaacag agtcctccgt gtcttctgcc 3960 

ttagcggagc tcgctactaa gaccttcggc agctccgaat catcggccgt cgacagcggc 4020 

acggcgaccg cccttcctga ccaggcctcc gacgacggtg acaaaggatc cgacgttgag 4080 

tcgtactcct ccatgccccc ccttgagggg gaaccggggg accccgatct cagtgacggg 4140 

tcttggtcta ccgtgagcga ggaagctagt gaggatgtcg tctgctgctc aatgtcctac 4200 

acatggacag gcgccttgat cacgccatgc gctgcggagg aaagcaagct gcccatcaac 4260 

gcgttgagca actctttgct gcgccaccat aacatggttt atgccacaac atctcgcagc 4320 

gcaggcctgc ggcagaagaa ggtcaccttt gacagactgc aagtcctgga cgaccactac 4380 

cgggacgtgc tcaaggagat gaaggcgaag gcgtccacag ttaaggctaa actcctatcc 4440 

gtagaggaag cctgcaagct gacgccccca cattcggcca aatccaagtt tggctatggg 4500 

gcaaaggacg tccggaacct atccagcaag gccgttaacc acatccactc cgtgtggaag 4560 

gacttgctgg aagacactgt gacaccaatt gacaccacca tcatggcaaa aaatgaggtt 4620 

ttctgtgtcc aaccagagaa aggaggccgt aagccagccc gccttatcgt attcccagat 4680

ctgggagtcc gtgtatgcga gaagatggcc ctctatgatg tggtctccac ccttcctcag 4740 

gtcgtgatgg gctcctcata cggattccag tactctcctg ggcagcgagt cgagttcctg 4800 

gtgaatacct ggaaatcaaa gaaaaacccc atgggctttt catatgacac tcgctgtttc 4860 

gactcaacgg tcaccgagaa cgacatccgt gttgaggagt caatttacca atgttgtgac 4920 

ttggcccccg aagccagaca ggccataaaa tcgctcacag agcggcttta tatcgggggt 4980 

cctctgacta attcaaaagg gcagaactgc ggttatcgcc ggtgccgcgc gagcggcgtg 5040 

ctgacgacta gctgcggtaa caccctcaca tgttacttga aggcctctgc agcctgtcga 5100 

gctgcgaagc tccaggactg cacgatgctc gtgaacgccg ccggccttgt cgttatctgt 5160 

gaaagcgcgg gaacccaaga ggacgcggcg agcctacgag tcttcacgga ggctatgact 5220 

aggtactctg ccccccccgg ggacccgccc caaccagaat acgacttgga gctgataaca 5280 

tcatgttcct ccaatgtgtc ggtcgcccac gatgcatcag gcaaaagggt gtactacctc 5340 

acccgtgatc ccaccacccc cctcgcacgg gctgcgtggg aaacagctag acacactcca 5400 

gttaactcct ggctaggcaa cattatcatg tatgcgccca ctttgtgggc aaggatgatt 5460

ctgatgactc acttcttctc catccttcta gcacaggagc aacttgaaaa agccctggac 5520

tgccagatct acggggcctg ttactccatt gagccacttg acctacctca gatcattgaa 5580

cgactccatg gccttagcgc attttcactc catagttact ctccaggtga gatcaatagg 5640

gtggcttcat gcctcaggaa acttggggta ccacccttgc gagtctggag acatcgggcc 5700

aggagcgtcc gcgctaggct actgtcccag ggggggaggg ccgccacttg tggcaagtac 5760

ctcttcaact gggcagtgaa gaccaaactc aaactcactc caatcccggc tgcgtcccag 5820

ctggacttgt ccggctggtt cgttgctggt tacagcgggg gagacatata tcacagcctg 5880

tctcgtgccc gaccccgctg gttcatgctg tgcctactcc tactttctgt aggggtaggc 5940

atctacctgc tccccaaccg ataaa 5965

<210> 3
<211> 5965
<212> DNA

<213> Artificial Sequence
<220>

<223> Optimized cDNA encoding SEQ ID NO: 1
<400> 3

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500

ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160

gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220

cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280 

aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 

cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 

agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 

ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 

gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 

cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 

cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 

agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 

cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 

atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 

cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 

ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300 

gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360

gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 

gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 

agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgcc 3780 

gccatgccca tctgggcccg ccccgactac aacccccccc tgctggagag ctggaaggac 3840 

cccgactacg tgccccccgt ggtgcacggc tgccccctgc cccccatcaa ggcccccccc 3900 

atcccccccc cccgccgcaa gcgcaccgtg gtgctgaccg agagcagcgt gagcagcgcc 3960 

ctggccgagc tggccaccaa gaccttcggc agcagcgaga gcagcgccgt ggacagcggc 4020 

accgccaccg ccctgcccga ccaggccagc gacgacggcg acaagggcag cgacgtggag 4080 

agctacagca gcatgccccc cctggagggc gagcccggcg accccgacct gagcgacggc 4140 

agctggagca ccgtgagcga ggaggccagc gaggacgtgg tgtgctgcag catgagctac 4200 

acctggaccg gcgccctgat caccccctgc gccgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gcgccaccac aacatggtgt acgccaccac cagccgcagc 4320 

gccggcctgc gccagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgacgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttccgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740 

gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 

gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860 

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 

cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 

ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 

gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 

gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 

agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 

acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460

ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520

tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580

cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640 

gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700 

cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820

ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880 

agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 

<210> 4 
<211> 37090 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> MRKAd6-NSmut nucleic acid 
<400> 4 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60 

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300 

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360 

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420 

cgggtcaaag ttggcgtttt attattatag gcggccgcga tccattgcat acgttgtatc 480 

catatcataa tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt 540 

gattattgac tagttattaa tagtaatcaa ctacggggtc attagttcat agcccatata 600 

tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 660 

cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 720 

attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 780 

atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 840 

atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 900 

tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 960 

actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 1020 

aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 1080 

gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg 1140 

cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc 1200 

tccgcggccg ggaacggtgc attggaacgc ggactccccg tgccaagagt gagatctgcc 1260 

accatggcgc ccatcacggc ctactcccaa cagacgcggg gcctacttgg ttgcatcatc 1320 

actagcctta caggccggga caagaaccag gtcgagggag aggttcaggt ggtttccacc 1380 

gcaacacaat ccttcctggc gacctgcgtc aacggcgtgt gttggaccgt ttaccatggt 1440 

gctggctcaa agaccttagc cggcccaaag gggccaatca cccagatgta cactaatgtg 1500 

gaccaggacc tcgtcggctg gcaggcgccc cccggggcgc gttccttgac accatgcacc 1560 

tgtggcagct cagaccttta cttggtcacg agacatgctg acgtcattcc ggtgcgccgg 1620 

cggggcgaca gtagggggag cctgctctcc cccaggcctg tctcctactt gaagggctct 1680 

tcgggtggtc cactgctctg cccttcgggg cacgctgtgg gcatcttccg ggctgccgta 1740 

tgcacccggg gggttgcgaa ggcggtggac tttgtgcccg tagagtccat ggaaactact 1800 

atgcggtctc cggtcttcac ggacaactca tcccccccgg ccgtaccgca gtcatttcaa 1860 

gtggcccacc tacacgctcc cactggcagc ggcaagagta ctaaagtgcc ggctgcatat 1920 

gcagcccaag ggtacaaggt gctcgtcctc aatccgtccg ttgccgctac cttagggttt 1980 

ggggcgtata tgtctaaggc acacggtatt gaccccaaca tcagaactgg ggtaaggacc 2040 

attaccacag gcgcccccgt cacatactct acctatggca agtttcttgc cgatggtggt 2100 

tgctctgggg gcgcttatga catcataata tgtgatgagt gccattcaac tgactcgact 2160

acaatcttgg gcatcggcac agtcctggac caagcggaga cggctggagc gcggcttgtc 2220

gtgctcgcca ccgctacgcc tccgggatcg gtcaccgtgc cacacccaaa catcgaggag 2280 

gtggccctgt ctaatactgg agagatcccc ttctatggca aagccatccc cattgaagcc 2340 

atcagggggg gaaggcatct cattttctgt cattccaaga agaagtgcga cgagctcgcc 2400 

gcaaagctgt caggcctcgg aatcaacgct gtggcgtatt accgggggct cgatgtgtcc 2460 

gtcataccaa ctatcggaga cgtcgttgtc gtggcaacag acgctctgat gacgggctat 2520 

acgggcgact ttgactcagt gatcgactgt aacacatgtg tcacccagac agtcgacttc 2580 

agcttggatc ccaccttcac cattgagacg acgaccgtgc ctcaagacgc agtgtcgcgc 2640 

tcgcagcggc ggggtaggac tggcaggggt aggagaggca tctacaggtt tgtgactccg 2700 

ggagaacggc cctcgggcat gttcgattcc tcggtcctgt gtgagtgcta tgacgcgggc 2760 

tgtgcttggt acgagctcac ccccgccgag acctcggtta ggttgcgggc ctacctgaac 2820 

acaccagggt tgcccgtttg ccaggaccac ctggagttct gggagagtgt cttcacaggc 2880 

ctcacccaca tagatgcaca cttcttgtcc cagaccaagc aggcaggaga caacttcccc 2940 

tacctggtag cataccaagc cacggtgtgc gccagggctc aggccccacc tccatcatgg 3000 

gatcaaatgt ggaagtgtct catacggctg aaacctacgc tgcacgggcc aacacccttg 3060 

ctgtacaggc tgggagccgt ccaaaatgag gtcaccctca cccaccccat aaccaaatac 3120 

atcatggcat gcatgtcggc tgacctggag gtcgtcacta gcacctgggt gctggtgggc 3180 

ggagtccttg cagctctggc cgcgtattgc ctgacaacag gcagtgtggt cattgtgggt 3240 

aggattatct tgtccgggag gccggctatt gttcccgaca gggagtttct ctaccaggag 3300 

ttcgatgaaa tggaagagtg cgcctcgcac ctcccttaca tcgagcaggg aatgcagctc 3360 

gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag 3420 

gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac 3480 

atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac 3540 

cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa 3600 

agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc 3660 

gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 3720 

gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 3780 

gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct 3840 

gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 3900 

cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 3960 

cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt 4020 

actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt 4080 

aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 4140 

tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg 4200 

ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 4260 

atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 4320 

aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 4380 

tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg 4440 
gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc 4500 
atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag 4560 
gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag 4620 
gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc 4680 
gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa 4740 
acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 4800 
cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct 4860 
gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 4920 
gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 4980 
gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 5040 
atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 5100 
gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 5160
ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta 5220 
gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg 5280 
gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg 5340 
tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 5400 
tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 5460

tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 5520

ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 5580

ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 5640

gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 5700 

gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 5760 

aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 5820 

ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 5880 

tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 5940 

ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 6000 

gtgatgggct cctcatacgg attccagtac tctcctgggc agcgagtcga gttcctggtg 6060 

aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 6120 

tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 6180 

gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 6240 

ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 6300

acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 6360

gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 6420 

agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 6480 

tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 6540 

tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 6600

cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt 6660 

aactcctggc taggcaacat tatcatgtat gcgcccactt tgtgggcaag gatgattctg 6720 

atgactcact tcttctccat ccttctagca caggagcaac ttgaaaaagc cctggactgc 6780 

cagatctacg gggcctgtta ctccattgag ccacttgacc tacctcagat cattgaacga 6840

ctccatggcc ttagcgcatt ttcactccat agttactctc caggtgagat caatagggtg 6900 

gcttcatgcc tcaggaaact tggggtacca cccttgcgag tctggagaca tcgggccagg 6960 

agcgtccgcg ctaggctact gtcccagggg gggagggccg ccacttgtgg caagtacctc 7020 

ttcaactggg cagtgaagac caaactcaaa ctcactccaa tcccggctgc gtcccagctg 7080 

gacttgtccg gctggttcgt tgctggttac agcgggggag acatatatca cagcctgtct 7140 

cgtgcccgac cccgctggtt catgctgtgc ctactcctac tttctgtagg ggtaggcatc 7200 

tacctgctcc ccaaccggta aatctagagc tgtgccttct agttgccagc catctgttgt 7260 

ttgcccctcc cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta 7320 

ataaaatgag gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg 7380 

ggtggggcag gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc 7440 

ggtgggctct atggccgatc ggcgcgccgt actgaaatgt gtgggcgtgg cttaagggtg 7500 

ggaaagaata tataaggtgg gggtcttatg tagttttgta tctgttttgc agcagccgcc 7560 

gccgccatga gcaccaactc gtttgatgga agcattgtga gctcatattt gacaacgcgc 7620 

atgcccccat gggccggggt gcgtcagaat gtgatgggct ccagcattga tggtcgcccc 7680 

gtcctgcccg caaactctac taccttgacc tacgagaccg tgtctggaac gccgttggag 7740 

actgcagcct ccgccgccgc ttcagccgct gcagccaccg cccgcgggat tgtgactgac 7800 

tttgctttcc tgagcccgct tgcaagcagt gcagcttccc gttcatccgc ccgcgatgac 7860 

aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa tgtcgtttct 7920 

cagcagctgt tggatctgcg ccagcaggtt tctgccctga aggcttcctc ccctcccaat 7980 

gcggtttaaa acataaataa aaaaccagac tctgtttgga tttggatcaa gcaagtgtct 8040 

tgctgtcttt atttaggggt tttgcgcgcg cggtaggccc gggaccagcg gtctcggtcg 8100 

ttgagggtcc tgtgtatttt ttccaggacg tggtaaaggt gactctggat gttcagatac 8160 

atgggcataa gcccgtctct ggggtggagg tagcaccact gcagagcttc atgctgcggg 8220 

gtggtgttgt agatgatcca gtcgtagcag gagcgctggg cgtggtgcct aaaaatgtct 8280 

ttcagtagca agctgattgc caggggcagg cccttggtgt aagtgtttac aaagcggtta 8340 

agctgggatg ggtgcatacg tggggatatg agatgcatct tggactgtat ttttaggttg 8400 

gctatgttcc cagccatatc cctccgggga ttcatgttgt gcagaaccac cagcacagtg 8460 

tatccggtgc acttgggaaa tttgtcatgt agcttagaag gaaatgcgtg gaagaacttg 8520 

gagacgccct tgtgacctcc aagattttcc atgcattcgt ccataatgat ggcaatgggc 8580 

ccacgggcgg cggcctgggc gaagatattt ctgggatcac taacgtcata gttgtgttcc 8640 

aggatgagat cgtcataggc catttttaca aagcgcgggc ggagggtgcc agactgcggt 8700 

ataatggttc catccggccc aggggcgtag ttaccctcac agatttgcat ttcccacgct 8760

ttgagttcag atggggggat catgtctacc tgcggggcga tgaagaaaac ggtttccggg 8820

gtaggggaga tcagctggga agaaagcagg ttcctgagca gctgcgactt accgcagccg 8880 

gtgggcccgt aaatcacacc tattaccggc tgcaactggt agttaagaga gctgcagctg 8940 

ccgtcatccc tgagcagggg ggccacttcg ttaagcatgt ccctgactcg catgttttcc 9000 

ctgaccaaat ccgccagaag gcgctcgccg cccagcgata gcagttcttg caaggaagca 9060 

aagtttttca acggtttgag accgtccgcc gtaggcatgc ttttgagcgt ttgaccaagc 9120 

agttccaggc ggtcccacag ctcggtcacc tgctctacgg catctcgatc cagcatatct 9180 

cctcgtttcg cgggttgggg cggctttcgc tgtacggcag tagtcggtgc tcgtccagac 9240 

gggccagggt catgtctttc cacgggcgca gggtcctcgt cagcgtagtc tgggtcacgg 9300 

tgaaggggtg cgctccgggc tgcgcgctgg ccagggtgcg cttgaggctg gtcctgctgg 9360 

tgctgaagcg ctgccggtct tcgccctgcg cgtcggccag gtagcatttg accatggtgt 9420 

catagtccag cccctccgcg gcgtggccct tggcgcgcag cttgcccttg gaggaggcgc 9480 

cgcacgaggg gcagtgcaga cttttgaggg cgtagagctt gggcgcgaga aataccgatt 9540 

ccggggagta ggcatccgcg ccgcaggccc cgcagacggt ctcgcattcc acgagccagg 9600 

tgagctctgg ccgttcgggg tcaaaaacca ggtttccccc atgctttttg atgcgtttct 9660 

tacctctggt ttccatgagc cggtgtccac gctcggtgac gaaaaggctg tccgtgtccc 9720 

cgtatacaga cttgagaggc ctgtcctcga gcggtgttcc gcggtcctcc tcgtatagaa 9780 

actcggacca ctctgagacg aaggctcgcg tccaggccag cacgaaggag gctaagtggg 9840 

aggggtagcg gtcgttgtcc actagggggt ccactcgctc cagggtgtga agacacatgt 9900 

cgccctcttc ggcatcaagg aaggtgattg gtttataggt gtaggccacg tgaccgggtg 9960 

ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat 10020 

cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 10080 

ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 10140 

tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 10200 

gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg 10260 

tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 10320 

gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 10380 

gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 10440 

gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta 10500 

gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt 10560 

cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 10620 

gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 10680 

cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag 10740 

ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 10800 

cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 10860 

tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 10920 

ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 10980 

gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 11040 

acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt 11100 

tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 11160 

actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 11220 

cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt 11280 

actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 11340 

gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg 11400 

cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 11460 

ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 11520 

gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 11580 

gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 11640 

aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaagg 11700 

tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 11760 

cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 11820 

gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc 11880 

ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 11940 

agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 12000 

gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc 12060

agtactggca gcggtgcacg ggctgtacat cctgcacgag gttgacctga cgaccgcgca 12120

caaggaagca gagtgggaat ttgagcccct cgcctggcgg gtttggctgg tggtcttcta 12180

cttcggctgc ttgtccttga ccgtctggct gctcgagggg agttacggtg gatcggacca 12240 

ccacgccgcg cgagcccaaa gtccagatgt ccgcgcgcgg cggtcggagc ttgatgacaa 12300 

catcgcgcag atgggagctg tccatggtct ggagctcccg cggcgtcagg tcaggcggga 12360 

gctcctgcag gtttacctcg catagccggg tcagggcgcg ggctaggtcc aggtgatacc 12420 

tgatttccag gggctggttg gtggcggcgt cgatggcttg caagaggccg catccccgcg 12480 

gcgcgactac ggtaccgcgc ggcgggcggt gggccgcggg ggtgtccttg gatgatgcat 12540

ctaaaagcgg tgacgcgggc gggcccccgg aggtaggggg ggctcgggac ccgccgggag 12600 

agggggcagg ggcacgtcgg cgccgcgcgc gggcaggagc tggtgctgcg cgcggaggtt 12660 

gctggcgaac gcgacgacgc ggcggttgat ctcctgaatc tggcgcctct gcgtgaagac 12720 

gacgggcccg gtgagcttga acctgaaaga gagttcgaca gaatcaattt cggtgtcgtt 12780 

gacggcggcc tggcgcaaaa tctcctgcac gtctcctgag ttgtcttgat aggcgatctc 12840 

ggccatgaac tgctcgatct cttcctcctg gagatctccg cgtccggctc gctccacggt 12900 

ggcggcgagg tcgttggaga tgcgggccat gagctgcgag aaggcgttga ggcctccctc 12960 

gttccagacg cggctgtaga ccacgccccc ttcggcatcg cgggcgcgca tgaccacctg 13020 

cgcgagattg agctccacgt gccgggcgaa gacggcgtag tttcgcaggc gctgaaagag 13080 

gtagttgagg gtggtggcgg tgtgttctgc cacgaagaag tacataaccc agcgccgcaa 13140 

cgtggattcg ttgatatccc ccaaggcctc aaggcgctcc atggcctcgt agaagtccac 13200 

ggcgaagttg aaaaactggg agttgcgcgc cgacacggtt aactcctcct ccagaagacg 13260 

gatgagctcg gcgacagtgt cgcgcacctc gcgctcaaag gctacagggg cctcttcttc 13320 

ttcttcaatc tcctcttcca taagggcctc cccttcttct tcttctggcg gcggtggggg 13380 

aggggggaca cggcggcgac gacggcgcac cgggaggcgg tcgacaaagc gctcgatcat 13440 

ctccccgcgg cgacggcgca tggtctcggt gacggcgcgg ccgttctcgc gggggcgcag 13500 

ttggaagacg ccgcccgtca tgtcccggtt atgggttggc ggggggctgc cgtgcggcag 13560 

ggatacggcg ctaacgatgc atctcaacaa ttgttgtgta ggtactccgc caccgaggga 13620 

cctgagcgag tccgcatcga ccggatcgga aaacctctcg agaaaggcgt ctaaccagtc 13680 

acagtcgcaa ggtaggctga gcaccgtggc gggcggcagc gggcggcggt cggggttgtt 13740 

tctggcggag gtgctgctga tgatgtaatt aaagtaggcg gtcttgagac ggcggatggt 13800 

cgacagaagc accatgtcct tgggtccggc ctgctgaatg cgcaggcggt cggccatgcc 13860 

ccaggcttcg ttttgacatc ggcgcaggtc tttgtagtag tcttgcatga gcctttctac 13920 

cggcacttct tcttctcctt cctcttgtcc tgcatctctt gcatctatcg ctgcggcggc 13980 

ggcggagttt ggccgtaggt ggcgccctct tcctcccatg cgtgtgaccc cgaagcccct 14040 

catcggctga agcagggcca ggtcggcgac aacgcgctcg gctaatatgg cctgctgcac 14100 

ctgcgtgagg gtagactgga agtcgtccat gtccacaaag cggtggtatg cgcccgtgtt 14160 

gatggtgtaa gtgcagttgg ccataacgga ccagttaacg gtctggtgac ccggctgcga 14220 

gagctcggtg tacctgagac gcgagtaagc ccttgagtca aagacgtagt cgttgcaagt 14280 

ccgcaccagg tactggtatc ccaccaaaaa gtgcggcggc ggctggcggt agaggggcca 14340 

gcgtagggtg gccggggctc cgggggcgag gtcttccaac ataaggcgat gatatccgta 14400

gatgtacctg gacatccagg tgatgccggc ggcggtggtg gaggcgcgcg gaaagtcacg 14460 

gacgcggttc cagatgttgc gcagcggcaa aaagtgctcc atggtcggga cgctctggcc 14520 

ggtcaggcgc gcgcagtcgt tgacgctcta gaccgtgcaa aaggagagcc tgtaagcggg 14580 

cactcttccg tggtctggtg gataaattcg caagggtatc atggcggacg accggggttc 14640 

gaaccccgga tccggccgtc cgccgtgatc catgcggtta ccgcccgcgt gtcgaaccca 14700 

ggtgtgcgac gtcagacaac gggggagcgc tccttttggc ttccttccag gcgcggcgga 14760 

tgctgcgcta gcttttttgg ccactggccg cgcgcggcgt aagcggttag gctggaaagc 14820 

gaaagcatta agtggctcgc tccctgtagc cggagggtta ttttccaagg gttgagtcgc 14880 

gggacccccg gttcgagtct cgggccggcc ggactgcggc gaacgggggt ttgcctcccc 14940 

gtcatgcaag accccgcttg caaattcctc cggaaacagg gacgagcccc ttttttgctt 15000 

ttcccagatg catccggtgc tgcggcagat gcgcccccct cctcagcagc ggcaagagca 15060 

agagcagcgg cagacatgca gggcaccctc cccttctcct accgcgtcag gaggggcaac 15120 

atccgcggct gacgcggcgg cagatggtga ttacgaaccc ccgcggcgcc ggacccggca 15180 

ctacttggac ttggaggagg gcgagggcct ggcgcggcta ggagcgccct ctcctgagcg 15240 

acacccaagg gtgcagctga agcgtgacac gcgcgaggcg tacgtgccgc ggcagaacct 15300 

gtttcgcgac cgcgagggag aggagcccga ggagatgcgg gatcgaaagt tccatgcagg 15360

gcgcgagttg cggcatggcc tgaaccgcga gcggttgctg cgcgaggagg actttgagcc 15420

cgacgcgcgg accgggatta gtcccgcgcg cgcacacgtg gcggccgccg acctggtaac 15480 

cgcgtacgag cagacggtga accaggagat taactttcaa aaaagcttta acaaccacgt 15540 

gcgcacgctt gtggcgcgcg aggaggtggc tataggactg atgcatctgt gggactttgt 15600 

aagcgcgctg gagcaaaacc caaatagcaa gccgctcatg gcgcagctgt tccttatagt 15660 

gcagcacagc agggacaacg aggcattcag ggatgcgctg ctaaacatag tagagcccga 15720 

gggccgctgg ctgctcgatt tgataaacat tctgcagagc atagtggtgc aggagcgcag 15780 

cttgagcctg gctgacaagg tggccgccat taactattcc atgctcagtc tgggcaagtt 15840 

ttacgcccgc aagatatacc atacccctta cgttcccata gacaaggagg taaagatcga 15900 

ggggttctac atgcgcatgg cgctgaaggt gcttaccttg agcgacgacc tgggcgttta 15960 

tcgcaacgag cgcatccaca aggccgtgag cgtgagccgg cggcgcgagc tcagcgaccg 16020 

cgagctgatg cacagcctgc aaagggccct ggctggcacg ggcagcggcg atagagaggc 16080 

cgagtcctac tttgacgcgg gcgctgacct gcgctgggcc ccaagccgac gcgccctgga 16140 

ggcagctggg gccggacctg ggctggcggt ggcacccgcg cgcgctggca acgtcggcgg 16200 

cgtggaggaa tatgacgagg acgatgagta cgagccagag gacggcgagt actaagcggt 16260 

gatgtttctg atcagatgat gcaagacgca acggacccgg cggtgcgggc ggcgctgcag 16320 

agccagccgt ccggccttaa ctccacggac gactggcgcc aggtcatgga ccgcatcatg 16380 

tcgctgactg cgcgcaaccc tgacgcgttc cggcagcagc cgcaggccaa ccggctctcc 16440 

gcaattctgg aagcggtggt cccggcgcgc gcaaacccca cgcacgagaa ggtgctggcg 16500 

atcgtaaacg cgctggccga aaacagggcc atccggcccg atgaggccgg cctggtctac 16560 

gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac 16620 

cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc 16680 

aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg 16740 

cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca 16800 

ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc 16860 

ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg 16920 

gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg 16980 

ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt 17040 

cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc 17100 

caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca 17160 

accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac 17220 

agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc 17280 

gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg 17340 

tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc 17400 

gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt 17460 

ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata 17520 

gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag 17580 

gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc 17640 

gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc 17700

agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg 17760 

ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc 17820 

ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc 17880 

ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac 17940 

gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca 18000 

caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac 18060 

tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc 18120 

ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt 18180 

ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg 18240 

gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt 18300 

cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta 18360 

ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg 18420 

ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa 18480 

aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa 18540 

ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga 18600 

gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat 18660

gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag 18720

cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg 18780

tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc 18840

aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg 18900

gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc 18960

cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg 19020

gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc 19080

agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt 19140

tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc 19200

cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc 19260

cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac 19320

ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc 19380

atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt 19440

gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc 19500

ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt 19560

ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga 19620

gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc 19680

tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt 19740

gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt 19800

ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc 19860

cagcaataac acaggctggg gcctgcgctt cccaagcaag atgtttggcg gggccaagaa 19920

gcgctccgac caacacccag tgcgcgtgcg cgggcactac cgcgcgccct ggggcgcgca 19980

caaacgcggc cgcactgggc gcaccaccgt cgatgacgcc atcgacgcgg tggtggagga 20040

ggcgcgcaac tacacgccca cgccgccgcc agtgtccacc gtggacgcgg ccattcagac 20100

cgtggtgcgc ggagcccggc gctacgctaa aatgaagaga cggcggaggc gcgtagcacg 20160

tcgccaccgc cgccgacccg gcactgccgc ccaacgcgcg gcggcggccc tgcttaaccg 20220

cgcacgtcgc accggccgac gggcggccat gcgagccgct cgaaggctgg ccgcgggtat 20280

tgtcactgtg ccccccaggt ccaggcgacg agcggccgcc gcagcagccg cggccattag 20340

tgctatgact cagggtcgca ggggcaacgt gtactgggtg cgcgactcgg tcagcggcct 20400

gcgcgtgccc gtgcgcaccc gccccccgcg caactagatt gcaataaaaa actacttaga 20460

ctcgtactgt tgtatgtatc cagcggcggc ggcgcgcatc gaagctatgt ccaagcgcaa 20520

aatcaaagaa gagatgctcc aggtcatcgc gccggagatc tatggccccc cgaagaagga 20580

agagcaggat tacaagcccc gaaagctaaa gcgggtcaaa aagaaaaaga aagatgatga 20640

tgatgatgaa cttgacgacg aggtggaact gttgcacgcg accgcgccca ggcgacgggt 20700

acagtggaaa ggtcgacgcg taagacgtgt tttgcgaccc ggcaccaccg tagtctttac 20760

gcccggtgag cgctccaccc gcacctacaa gcgcgtgtat gatgaggtgt acggcgacga 20820

ggacctgctt gagcaggcca acgagcgcct cggggagttt gcctacggaa agcggcataa 20880

ggacatgctg gcgttgccgc tggacgaggg caacccaaca cctagcctaa agcccgtgac 20940

actgcagcag gtgctgcccg cgcttgcacc gtccgaagaa aagcgcggcc taaagcgcga 21000

gtctggtgac ttggcaccca ccgtgcagct gatggtaccc aagcgtcagc gactggaaga 21060

tgtcttggaa aaaatgaccg tggagcctgg gctggagccc gaggtccgcg tgcggccaat 21120

caagcaggtg gcaccgggac tgggcgtgca gaccgtggac gttcagatac ccaccaccag 21180

tagcactagt attgccactg ccacagaggg catggagaca caaacgtccc cggttgcctc 21240

ggcggtggca gatgccgcgg tgcaggcggc cgctgcggcc gcgtccaaga cctctacgga 21300

ggtgcaaacg gacccgtgga tgtttcgtgt ttcagccccc cggcgtccgc gccgttcaag 21360

gaagtacggc gccgccagcg cgctactgcc cgaatatgcc ctacatcctt ccatcgcgcc 21420

tacccccggc tatcgtggct acacctaccg ccccagaaga cgagcaacta cccgacgccg 21480

aaccaccact ggaacccgcc gccgccgtcg ccgtcgccag cccgtgctgg ccccgatttc 21540

cgtgcgcagg gtggctcgcg aaggaggcag gaccctggtg ctgccaacag cgcgctacca 21600

ccccagcatc gtttaaaagc cggtctttgt ggttcttgca gatatggccc tcacctgccg 21660

cctccgtttc ccggtgccgg gattccgagg aagaatgcac cgtaggaggg gcatggccgg 21720

ccacggcctg acgggcggca tgcgtcgtgc gcaccaccgg cggcggcgcg cgtcgcaccg 21780

tcgcatgcgc ggcggtatcc tgcccctcct tattccactg atcgccgcgg cgattggcgc 21840

cgtgcccgga attgcatccg tggccttgca ggcgcagaga cactgattaa aaacaagtta 21900

catgtggaaa aatcaaaata aaagtctgga ctctcacgct cgcttggtcc tgtaactatt 21960

ttgtagaatg gaagacatca actttgcgtc actggccccg cgacacggct cgcgcccgtt 22020

catgggaaac tggcaagata tcggcaccag caatatgagc ggtggcgcct tcagctgggg 22080 

ctcgctgtgg agcggcatta aaaatttcgg ttccgccgtt aagaactatg gcagcaaagc 22140 

ctggaacagc agcacaggcc agatgctgag ggacaagttg aaagagcaaa atttccaaca 22200 

aaaggtggta gatggcctgg cctctggcat tagcggggtg gtggacctgg ccaaccaggc 22260 

agtgcaaaat aagattaaca gtaagcttga tccccgccct cccgtagagg agcctccacc 22320 

ggccgtggag acagtgtctc cagaggggcg tggcgaaaag cgtccgcgac ccgacaggga 22380 

agaaactctg gtgacgcaaa tagacgagcc tccctcgtac gaggaggcac taaagcaagg 22440 

cctgcccacc acccgtccca tcgcgcccat ggctaccgga gtgctgggcc agcacacacc 22500 

cgtaacgctg gacctgcctc cccccgccga cacccagcag aaacctgtgc tgccaggccc 22560 

gtccgccgtt gttgtaaccc gtcctagccg cgcgtccctg cgccgcgccg ccagcggtcc 22620 

gcgatcgttg cggcccgtag ccagtggcaa ctggcaaagc acactgaaca gcatcgtggg 22680 

tttgggggtg caatccctga agcgccgacg atgcttctga tagctaacgt gtcgtatgtg 22740 

tgtcatgtat gcgtccatgt cgccgccaga ggagctgctg agccgccgcg cgcccgcttt 22800 

ccaagatggc taccccttcg atgatgccgc agtggtctta catgcacatc tcgggccagg 22860 

acgcctcgga gtacctgagc cccgggctgg tgcagttcgc ccgcgccacc gagacgtact 22920 

tcagcctgaa taacaagttt agaaacccca cggtggcgcc tacgcacgac gtgaccacag 22980 

accggtctca gcgtttgacg ctgcggttca tccccgtgga ccgcgaggat actgcgtact 23040 

cgtacaaggc gcggttcacc ctagctgtgg gtgataaccg tgtgctagac atggcttcca 23100 

cgtactttga catccgcggc gtgctggaca ggggccctac ttttaagccc tactctggca 23160 

ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg 23220 

aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc 23280 

aggcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg 23340 

gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg 23400 

ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 23460 

aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca 23520

tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta 23580

tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 23640 

caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 23700 

taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg 23760 

ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag 23820 

acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc 23880 

aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 23940 

agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 24000 

ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 24060 

caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 24120 

caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag 24180 

aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc 24240 

tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 24300 

accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag 24360 

tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 24420 

acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 24480 

tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 24540 

ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 24600 

atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 24660 

ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 24720 

ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 24780 

ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc 24840

catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 24900 

ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 24960 

ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 25020 

ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgagtttgag attaagcgct 25080 

cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 25140 

tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca 25200 

aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 25260

atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 25320

tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct 25380 

acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 25440 

gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 25500 

tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg 25560 

atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 25620 

tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 25680 

gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 25740 

tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac 25800 

ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 25860 

cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 25920 

aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta 25980 

ccagtttgag tacgagtcac tcctgcgccg tagcgccatt gcctcttccc ccgaccgctg 26040 

tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcct 26100 

attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 26160 

ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 26220 

gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgcccta 26280 

cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 26340 

gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 26400 

tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 26460

gccgcgcatc gctatgcgcc actggcaggg acacgttgcg atactggtgt ttagtgctcc 26520 

acttaaactc aggcacaacc atccgcggca gctcggtgaa gttttcactc cacaggctgc 26580 

gcaccatcac caacgcgttt agcaggtcgg gcgccgatat cttgaagtcg cagttggggc 26640 

ctccgccctg cgcgcgcgag ttgcgataca cagggttaca gcactggaac actatcagcg 26700 

ccgggtggtg cacgctggcc agcacgctct tgtcggagat cagatccgcg tccaggtcct 26760

ccgcgttgct cagggcgaac ggagtcaact ttggtagctg ccttcccaaa aagggtgcat 26820 

gcccaggctt tgagttgcac tcgcaccgta gtggcatcag aaggtgaccg tgcccagtct 26880 

gggcgttagg atacagcgcc tgcatgaaag ccttgatctg cttaaaagcc acctgagcct 26940 

ttgcgccttc agagaagaac atgccgcaag acttgccgga aaactgattg gccggacagg 27000 

ccgcgtcatg cacgcagcac cttgcgtcgg tgttggagat ctgcaccaca tttcggcccc 27060 

accggttctt cacgatcttg gccttgctag actgctcctt cagcgcgcgc tgcccgtttt 27120 

cgctcgtcac atccatttca atcacgtgct ccttatttat cataatgctc ccgtgtagac 27180 

acttaagctc gccttcgatc tcagcgcagc ggtgcagcca caacgcgcag cccgtgggct 27240 

cgtggtgctt gtaggttacc tctgcaaacg actgcaggta cgcctgcagg aatcgcccca 27300 

tcatcgtcac aaaggtcttg ttgctggtga aggtcagctg caacccgcgg tgctcctcgt 27360 

ttagccaggt cttgcatacg gccgccagag cttccacttg gtcaggcagt agcttgaagt 27420 

ttgcctttag atcgttatcc acgtggtact tgtccatcaa cgcgcgcgca gcctccatgc 27480 

ccttctccca cgcagacacg atcggcaggc tcagcgggtt tatcaccgtg ctttcacttt 27540 

ccgcttcact ggactcttcc ttttcctctt gcatccgcat accccgcgcc actgggtcgt 27600 

cttcattcag ccgccgcacc gtgcgcctac ctcccttgcc gtgcttgatt agcaccggtg 27660 

ggttgctgaa acccaccatt tgtagcgcca catcttctct ttcttcctcg ctgtccacga 27720

tcacctctgg ggatggcggg cgctcgggct tgggagaggg gcgcttcttt ttctttttgg 27780 

acgcaatggc caaatccgcc gtcgaggtcg atggccgcgg gctgggtgtg cgcggcacca 27840 

gcgcatcttg tgacgagtct tcttcgtcct cggactcgag acgccgcctc agccgctttt 27900 

ttgggggcgc gcggggaggc ggcggcgacg gcgacgggga cgagacgtcc tccatggttg 27960 

gtggacgtcg cgccgcaccg cgtccgcgct cgggggtggt ttcgcgctgc tcctcttccc 28020 

gactggccat ttccttctcc tataggcaga aaaagatcat ggagtcagtc gagaaggagg 28080 

acagcctaac cgcccccttt gagttcgcca ccaccgcctc caccgatgcc gccaacgcgc 28140 

ctaccacctt ccccgtcgag gcacccccgc ttgaggagga ggaagtgatt atcgagcagg 28200 

acccaggttt tgtaagcgaa gacgacgaag atcgctcagt accaacagag gataaaaagc 28260 

aagaccagga cgacgcagag gcaaacgagg aacaagtcgg gcggggggac caaaggcatg 28320 

gcgactacct agatgtggga gacgacgtgc tgttgaagca tctgcagcgc cagtgcgcca 28380

ttatctgcga cgcgttgcaa gagcgcagcg atgtgcccct cgccatagcg gatgtcagcc 28440 

ttgcctacga acgccacctg ttctcaccgc gcgtaccccc caaacgccaa gaaaacggca 28500 

catgcgagcc caacccgcgc ctcaacttct accccgtatt tgccgtgcca gaggtgcttg 28560

ccacctatca catctttttc caaaactgca agatacccct atcctgccgt gccaaccgca 28620

gccgagcgga caagcagctg gccttgcggc agggcgctgt catacctgat atcgcctcgc 28680 

tcgacgaagt gccaaaaatc tttgagggtc ttggacgcga cgagaagcgc gcggcaaacg 28740 

ctctgcaaca agaaaacagc gaaaatgaaa gtcactgtgg agtgctggtg gaacttgagg 28800 

gtgacaacgc gcgcctagcc gtgctgaaac gcagcatcga ggtcacccac tttgcctacc 28860 

cggcacttaa cctacccccc aaggttatga gcacagtcat gagcgagctg atcgtgcgcc 28920 

gtgcacgacc cctggagagg gatgcaaact tgcaagaaca aaccgaggag ggcctacccg 28980 

cagttggcga tgagcagctg gcgcgctggc ttgagacgcg cgagcctgcc gacttggagg 29040 

agcgacgcaa gctaatgatg gccgcagtgc ttgttaccgt ggagcttgag tgcatgcagc 29100 

ggttctttgc tgacccggag atgcagcgca agctagagga aacgttgcac tacacctttc 29160 

gccagggcta cgtgcgccag gcctgcaaaa tttccaacgt ggagctctgc aacctggtct 29220 

cctaccttgg aattttgcac gaaaaccgcc ttgggcaaaa cgtgcttcat tccacgctca 29280 

agggcgaggc gcgccgcgac tacgtccgcg actgcgttta cttatttctg tgctacacct 29340 

ggcaaacggc catgggcgtg tggcagcagt gcctggagga gcgcaacctg aaggagctgc 29400 

agaagctgct aaagcaaaac ttgaaggacc tatggacggc cttcaacgag cgctccgtgg 29460 

ccgcgcacct ggcggacatt atcttccccg aacgcctgct taaaaccctg caacagggtc 29520 

tgccagactt caccagtcaa agcatgttgc aaaactttag gaactttatc ctagagcgtt 29580 

caggaattct gcccgccacc tgctgtgcgc ttcctagcga ctttgtgccc attaagtacc 29640 

gtgaatgccc tccgccgctt tggggtcact gctaccttct gcagctagcc aactaccttg 29700 

cctaccactc cgacatcatg gaagacgtga gcggtgacgg cctactggag tgtcactgtc 29760 

gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa 29820 

gtcaaattat cggtaccttt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc 29880 

cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 29940 

aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 30000 

agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca 30060 

aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 30120 

gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg 30180 

cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag 30240 

gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa 30300 

gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg 30360

tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatg 30420

gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga 30480 

tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag 30540 

caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc 30600 

ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc 30660 

gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 30720 

ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 30780 

tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct 30840 

ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatttttc ccactctgta 30900 

tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 30960 

gcgctccctc acccgcagct gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 31020 

ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 31080 

cgccctttct caaatttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 31140 

agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 31200 

ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaataaacta 31260 

catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 31320 

aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 31380 

ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 31440 

agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggctttcg 31500 

tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 31560 

tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 31620 

gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ctctgcagac 31680 

ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt 31740 

gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 31800 

tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 31860

ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 31920

cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 31980

cggcgtccgg ctcaccaccc aggtagagct tacacgtagc ctgattcggg agtttaccaa 32040 

gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 32100

tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 32160 

attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 32220 

ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 32280 

aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 32340 

cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgacacg 32400 

gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc 32460 

caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 32520 

ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 32580 

tcaaatacaa tcactgtttc tcaaccgcta aaaaaaacaa agtccaatat aactttggaa 32640 

acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 32700

gtggtctctg acaacactct taccatgcaa tcacaagcac cgctaaccgt gcaagactca 32760 

aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 32820 

acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 32880 

cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat 32940 

ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta 33000 

acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca 33060 

ggcgcaatag ggtttgatac atctggcaac atggaactta aaactggaga tggcctctat 33120 

gtggatagcg ccggtcctaa ccaaaaacta catattaatc taaataccac aaaaggcctt 33180 

gcttttgaca acaccgcaat aacaattaac gctggaaaag ggttggaatt tgaaacagac 33240 

tcctcaaacg gaaatcccat aaaaacaaaa attggatcag gcatacaata taataccaat 33300 

ggagctatgg ttgcaaaact tggaacaggc ctcagttttg acagctccgg agccataaca 33360 

atgggcagca taaacaatga cagacttact ctttggacaa caccagaccc atccccaaat 33420 

tgcagaattg cttcagataa agactgcaag ctaactctgg cgctaacaaa atgtggcagt 33480 

caaattttgg gcactgtttc agctttggca gtatcaggta atatggcctc catcaatgga 33540 

actctaagca gtgtaaactt ggttcttaga tttgatgaca acggagtgct tatgtcaaat 33600

tcatcactgg acaaacagta ttggaacttt agaaacgggg actccactaa cggtcaacca 33660 

tacacttatg ctgttgggtt tatgccaaac ctaaaagctt acccaaaaac tcaaagtaaa 33720 

actgcaaaaa gtaatattgt tagccaggtg tatcttaatg gtgacaagtc taaaccattg 33780 

cattttacta ttacgctaaa tggaacagat gaaaccaacc aagtaagcaa atactcaata 33840 

tcattcagtt ggtcctggaa cagtggacaa tacactaatg acaaatttgc caccaattcc 33900 

tataccttct cctacattgc ccaggaataa agaatcgtga acctgttgca tgttatgttt 33960 

caacgtgttt atttttcaat tgcagaaaat ttcaagtcat ttttcattca gtagtatagc 34020 

cccaccacca catagcttat actaatcacc gtaccttaat caaactcaca gaaccctagt 34080 

attcaacctg ccacctccct cccaacacac agagtacaca gtcctttctc cccggctggc 34140 

cttaaacagc atcatatcat gggtaacaga catattctta ggtgttatat tccacacggt 34200 

ctcctgtcga gccaaacgct catcagtgat gttaataaac tccccgggca gctcgcttaa 34260 

gttcatgtcg ctgtccagct gctgagccac aggctgctgt ccaacttgcg gttgctcaac 34320 

gggcggcgaa ggagaagtcc acgcctacat gggggtagag tcataatcgt gcatcaggat 34380 

agggcggtgg tgctgcagca gcgcgcgaat aaactgctgc cgccgccgct ccgtcctgca 34440 

ggaatacaac atggcagtgg tctcctcagc gatgattcgc accgcccgca gcataaggcg 34500 

ccttgtcctc cgggcacagc agcgcaccct gatctcactt aagtcagcac agtaactgca 34560 

gcacagtacc acaatattgt ttaaaatccc acagtgcaag gcgctgtatc caaagctcat 34620 

ggcggggacc acagaaccca cgtggccatc ataccacaag cgcaggtaga ttaagtggcg 34680 

acccctcata aacacgctgg acataaacat tacctctttt ggcatgttgt aattcaccac 34740 

ctcccggtac catataaacc tctgattaaa catggcgcca tccaccacca tcctaaacca 34800 

gctggccaaa acctgcccgc cggctatgca ctgcagggaa ccgggactgg aacaatgaca 34860 

gtggagagcc caggactcgt aaccatggat catcatgctc gtcatgatat caatgttggc 34920 

acaacacagg cacacgtgca tacacttcct caggattaca agctcctccc gcgtcagaac 34980 

catatcccag ggaacaaccc attcctgaat cagcgtaaat cccacactgc agggaagacc 35040

tcgcacgtaa ctcacgttgt gcattgtcaa agtgttacat tcgggcagca gcggatgatc 35100 

ctccagtatg gtagcgcggg tttctgtctc aaaaggaggt agacgatccc tactgtacgg 35160

agtgcgccga gacaaccgag atcgtgttgg tcgtagtgtc atgccaaatg gaacgccgga 35220

cgtagtcata tttcctgaag caaaaccagg tgcgggcgtg acaaacagat ctgcgtctcc 35280 

ggtctcgccg cttagatcgc tctgtgtagt agttgtagta tatccactct ctcaaagcat 35340 

ccaggcgccc cctggcttcg ggttctatgt aaactccttc atgcgccgct gccctgataa 35400 

catccaccac cgcagaataa gccacaccca gccaacctac acattcgttc tgcgagtcac 35460 

acacgggagg agcgggaaga gctggaagaa ccatgttttt ttttttattc caaaagatta 35520 

tccaaaacct caaaatgaag atctattaag tgaacgcgct cccctccggt ggcgtggtca 35580 

aactctacag ccaaagaaca gataatggca tttgtaagat gttgcacaat ggcttccaaa 35640 

aggcaaacgg ccctcacgtc caagtggacg taaaggctaa acccttcagg gtgaatctcc 35700 

tctataaaca ttccagcacc ttcaaccatg cccaaataat tctcatctcg ccaccttctc 35760 

aatatatctc taagcaaatc ccgaatatta agtccggcca ttgtaaaaat ctgctccaga 35820 

gcgccctcca ccttcagcct caagcagcga atcatgattg caaaaattca ggttcctcac 35880 

agacctgtat aagattcaaa agcggaacat taacaaaaat accgcgatcc cgtaggtccc 35940 

ttcgcagggc cagctgaaca taatcgtgca ggtctgcacg gaccagcgcg gccacttccc 36000 

cgccaggaac catgacaaaa gaacccacac tgattatgac acgcatactc ggagctatgc 36060 

taaccagcgt agccccgatg taagcttgtt gcatgggcgg cgatataaaa tgcaaggtgc 36120 

tgctcaaaaa atcaggcaaa gcctcgcgca aaaaagaaag cacatcgtag tcatgctcat 36180 

gcagataaag gcaggtaagc tccggaacca ccacagaaaa agacaccatt tttctctcaa 36240 

acatgtctgc gggtttctgc ataaacacaa aataaaataa caaaaaaaca tttaaacatt 36300 

agaagcctgt cttacaacag gaaaaacaac ccttataagc ataagacgga ctacggccat 36360 

gccggcgtga ccgtaaaaaa actggtcacc gtgattaaaa agcaccaccg acagctcctc 36420 

ggtcatgtcc ggagtcataa tgtaagactc ggtaaacaca tcaggttgat tcacatcggt 36480 

cagtgctaaa aagcgaccga aatagcccgg gggaatacat acccgcaggc gtagagacaa 36540 

cattacagcc cccataggag gtataacaaa attaatagga gagaaaaaca cataaacacc 36600 

tgaaaaaccc tcctgcctag gcaaaatagc accctcccgc tccagaacaa catacagcgc 36660 

ttccacagcg gcagccataa cagtcagcct taccagtaaa aaagaaaacc tattaaaaaa 36720 

acaccactcg acacggcacc agctcaatca gtcacagtgt aaaaaagggc caagtgcaga 36780 

gcgagtatat ataggactaa aaaatgacgt aacggttaaa gtccacaaaa aacacccaga 36840 

aaaccgcacg cgaacctacg cccagaaacg aaagccaaaa aacccacaac ttcctcaaat 36900 

cgtcacttcc gttttcccac gttacgtcac ttcccatttt aagaaaacta caattcccaa 36960 

cacatacaag ttactccgcc ctaaaaccta cgtcacccgc cccgttccca cgccccgcgc 37020 

cacgtcacaa actccacccc ctcattatca tattggcttc aatccaaaat aaggtatatt 37080 

attgatgatg 37090

<210> 5
<211> 5955
<212> DNA
<213> Artificial Sequence
<220>
<223> NS cDNA sequence
<221> CDS
<222> (1)...(5955)



<400> 5 



atg gcg ccc atc acg gcc tac tcc caa cag acg cgg ggc cta ctt ggt 48
Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1 5 10 15

tgc atc atc act agc ctt aca ggc cgg gac aag aac cag gtc gag gga 96
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
20 25 30

gag gtt cag gtg gtt tcc acc gca aca caa tcc ttc ctg gcg acc tgc 144
Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
35 40 45

gtc aac ggc gtg tgt tgg acc gtt tac cat ggt gct ggc tca aag acc 192
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
50 55 60

tta gcc ggc cca aag ggg cca atc acc cag atg tac act aat gtg gac 240
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65 70 75 80

cag gac ctc gtc ggc tgg cag gcg ccc ccc ggg gcg cgt tcc ttg aca 288
Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
85 90 95



cca tgc acc tgt ggc agc tca gac ctt tac ttg gtc acg aga cat gct 336
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
100 105 110

gac gtc att ccg gtg cgc cgg cgg ggc gac agt agg ggg agc ctg ctc 384
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
115 120 125

tcc ccc agg cct gtc tcc tac ttg aag ggc tct tcg ggt ggt cca ctg 432
Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
130 135 140

ctc tgc cct tcg ggg cac gct gtg ggc atc ttc cgg gct gcc gta tgc 480
Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145 150 155 160

acc cgg ggg gtt gcg aag gcg gtg gac ttt gtg ccc gta gag tcc atg 528
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
165 170 175

gaa act act atg cgg tct ccg gtc ttc acg gac aac tca tcc ccc ccg 576
Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
180 185 190

gcc gta ccg cag tca ttt caa gtg gcc cac cta cac gct ccc act ggc 624
Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
195 200 205

agc ggc aag agt act aaa gtg ccg gct gca tat gca gcc caa ggg tac 672
Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
210 215 220

aag gtg etc gtc etc aat ccg tec gtt gee get acc tta ggg ttt ggg 72/>. 
Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly 
225 230 235 240 

gcg tat atg tct aag gca cac ggt att gac ccc aac ate aga act ggg 768 
Ala Tyr Met Ser Lys Ala His Gly He Asp Pro Asn He Arg Thr Gly 
245 250 255 
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gta agg. acc att acc aca ggc gcc ccc gtc aca tac tct acc tat ggc 816 
Val Arg Thr He Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly 
260 265 270 

aag ttt ctt gcc gat ggt ggt tgc tct ggg ggc gct tat gac atc ata 864
Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
275 280 285

ata tgt gat gag tgc cat tca act gac tcg act aca atc ttg ggc atc 912
Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
290 295 300

ggc aca gtc ctg gac caa gcg gag acg gct gga gcg cgg ctt gtc gtg 960
Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
305 310 315 320

ctc gcc acc gct acg cct ccg gga tcg gtc acc gtg cca cac cca aac 1008
Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
325 330 335

atc gag gag gtg gcc ctg tct aat act gga gag atc ccc ttc tat ggc 1056
Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
340 345 350

aaa gcc atc ccc att gaa gcc atc agg ggg gga agg cat ctc att ttc 1104
Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
355 360 365

tgt cat tcc aag aag aag tgc gac gag ctc gcc gca aag ctg tca ggc 1152
Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
370 375 380

ctc gga atc aac gct gtg gcg tat tac cgg ggg ctc gat gtg tcc gtc 1200
Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
385 390 395 400

ata cca act atc gga gac gtc gtt gtc gtg gca aca gac gct ctg atg 1248
Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
405 410 415

acg ggc tat acg ggc gac ttt gac tca gtg atc gac tgt aac aca tgt 1296
Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
420 425 430

gtc acc cag aca gtc gac ttc agc ttg gat ccc acc ttc acc att gag 1344
Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
435 440 445

acg acg acc gtg cct caa gac gca gtg tcg cgc tcg cag cgg cgg ggt 1392
Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
450 455 460

agg act ggc agg ggt agg aga ggc atc tac agg ttt gtg act ccg gga 1440
Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
465 470 475 480

gaa cgg ccc tcg ggc atg ttc gat tcc tcg gtc ctg tgt gag tgc tat 1488
Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
485 490 495

gac gcg ggc tgt gct tgg tac gag ctc acc ccc gcc gag acc tcg gtt 1536
Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
500 505 510

agg ttg cgg gcc tac ctg aac aca cca ggg ttg ccc gtt tgc cag gac 1584
Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
515 520 525

cac ctg gag ttc tgg gag agt gtc ttc aca ggc ctc acc cac ata gat 1632
His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
530 535 540

gca cac ttc ttg tcc cag acc aag cag gca gga gac aac ttc ccc tac 1680
Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
545 550 555 560

ctg gta gca tac caa gcc acg gtg tgc gcc agg gct cag gcc cca cct 1728
Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
565 570 575

cca tca tgg gat caa atg tgg aag tgt ctc ata cgg ctg aaa cct acg 1776
Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
580 585 590

ctg cac ggg cca aca ccc ttg ctg tac agg ctg gga gcc gtc caa aat 1824
Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
595 600 605

gag gtc acc ctc acc cac ccc ata acc aaa tac atc atg gca tgc atg 1872
Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
610 615 620

tcg gct gac ctg gag gtc gtc act agc acc tgg gtg ctg gtg ggc gga 1920
Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
625 630 635 640

gtc ctt gca gct ctg gcc gcg tat tgc ctg aca aca ggc agt gtg gtc 1968
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val
645 650 655

att gtg ggt agg att atc ttg tcc ggg agg ccg gct att gtt ccc gac 2016
Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
660 665 670

agg gag ttt ctc tac cag gag ttc gat gaa atg gaa gag tgc gcc tcg 2064
Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
675 680 685

cac ctc cct tac atc gag cag gga atg cag ctc gcc gag caa ttc aag 2112
His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
690 695 700

cag aaa gcg ctc ggg tta ctg caa aca gcc acc aaa caa gcg gag gct 2160
Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
705 710 715 720

gct gct ccc gtg gtg gag tcc aag tgg cga gcc ctt gag aca ttc tgg 2208
Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
725 730 735

gcg aag cac atg tgg aat ttc atc agc ggg ata cag tac tta gca ggc 2256
Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
740 745 750

tta tcc act ctg ccc ggg aac ccc gca ata gca tca ttg atg gca ttc 2304
Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
755 760 765

aca gcc tct atc acc agc ccg ctc acc acc caa agt acc ctc ctg ttt 2352
Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
770 775 780

aac atc ttg ggg ggg tgg gtg gct gcc caa ctc gcc ccc ccc agc gcc 2400
Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
785 790 795 800

gct tcg gct ttc gtg ggc gcc ggc atc gcc ggt gcg gct gtt ggc agc 2448
Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
805 810 815

ata ggc ctt ggg aag gtg ctt gtg gac att ctg gcg ggt tat gga gca 2496
Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
820 825 830

gga gtg gcc ggc gcg ctc gtg gcc ttc aag gtc atg agc ggc gag atg 2544
Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
835 840 845

ccc tcc acc gag gac ctg gtc aat cta ctt cct gcc atc ctc tct ccc 2592
Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro
850 855 860

ggc gcc ctg gtc gtc ggg gtc gtg tgt gca gca ata ctg cgt cga cac 2640
Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His
865 870 875 880

gtg ggt ccg gga gag ggg gct gtg cag tgg atg aac cgg ctg ata gcg 2688
Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala
885 890 895

ttc gcc tcg cgg ggt aat cat gtt tcc ccc acg cac tat gtg cct gag 2736
Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu
900 905 910

agc gac gcc gca gcg cgt gtt act cag atc ctc tcc agc ctt acc atc 2784
Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile
915 920 925

act cag ctg ctg aaa agg ctc cac cag tgg att aat gaa gac tgc tcc 2832
Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser
930 935 940



aca ccg tgt tcc ggc tcg tgg cta agg gat gtt tgg gac tgg ata tgc 2880
Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys
945 950 955 960

acg gtg ttg act gac ttc aag acc tgg ctc cag tcc aag ctc ctg ccg 2928
Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro
965 970 975

cag cta ccg gga gtc cct ttt ttc tcg tgc caa cgc ggg tac aag gga 2976
Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly
980 985 990

gtc tgg cgg gga gac ggc atc atg caa acc acc tgc cca tgt gga gca 3024
Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala
995 1000 1005

cag atc acc gga cat gtc aaa aac ggt tcc atg agg atc gtc ggg cct 3072
Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro
1010 1015 1020

aag acc tgc agc aac acg tgg cat gga aca ttc ccc atc aac gca tac 3120
Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr
1025 1030 1035 1040

acc acg ggc ccc tgc aca ccc tct cca gcg cca aac tat tct agg gcg 3168
Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala
1045 1050 1055

ctg tgg cgg gtg gcc gct gag gag tac gtg gag gtc acg cgg gtg ggg 3216
Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly
1060 1065 1070

gat ttc cac tac gtg acg ggc atg acc act gac aac gta aag tgc cca 3264
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro
1075 1080 1085

tgc cag gtt ccg gct cct gaa ttc ttc acg gag gtg gac gga gtg cgg 3312
Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg
1090 1095 1100

ttg cac agg tac gct ccg gcg tgc agg cct ctc cta cgg gag gag gtt 3360
Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val
1105 1110 1115 1120

aca ttc cag gtc ggg ctc aac caa tac ctg gtt ggg tca cag cta cca 3408
Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro
1125 1130 1135

tgc gag ccc gaa ccg gat gta gca gtg ctc act tcc atg ctc acc gac 3456
Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp
1140 1145 1150

ccc tcc cac atc aca gca gaa acg gct aag cgt agg ttg gcc agg ggg 3504
Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly
1155 1160 1165

tct ccc ccc tcc ttg gcc agc tct tca gct agc cag ttg tct gcg cct 3552
Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
1170 1175 1180

tcc ttg aag gcg aca tgc act acc cac cat gtc tct ccg gac gct gac 3600
Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp
1185 1190 1195 1200

ctc atc gag gcc aac ctc ctg tgg cgg cag gag atg ggc ggg aac atc 3648
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
1205 1210 1215

acc cgc gtg gag tcg gag aac aag gtg gta gtc ctg gac tct ttc gac 3696
Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp
1220 1225 1230

ccg ctt cga gcg gag gag gat gag agg gaa gta tcc gtt ccg gcg gag 3744
Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu
1235 1240 1245

atc ctg cgg aaa tcc aag aag ttc ccc gca gcg atg ccc atc tgg gcg 3792
Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala
1250 1255 1260

cgc ccg gat tac aac cct cca ctg tta gag tcc tgg aag gac ccg gac 3840
Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp
1265 1270 1275 1280

tac gtc cct ccg gtg gtg cac ggg tgc ccg ttg cca cct atc aag gcc 3888
Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala
1285 1290 1295

cct cca ata cca cct cca cgg aga aag agg acg gtt gtc cta aca gag 3936
Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu
1300 1305 1310

tcc tcc gtg tct tct gcc tta gcg gag ctc gct act aag acc ttc ggc 3984
Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly
1315 1320 1325

agc tcc gaa tca tcg gcc gtc gac agc ggc acg gcg acc gcc ctt cct 4032
Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro
1330 1335 1340

gac cag gcc tcc gac gac ggt gac aaa gga tcc gac gtt gag tcg tac 4080
Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr
1345 1350 1355 1360

tcc tcc atg ccc ccc ctt gag ggg gaa ccg ggg gac ccc gat ctc agt 4128
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser
1365 1370 1375

gac ggg tct tgg tct acc gtg agc gag gaa gct agt gag gat gtc gtc 4176
Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val
1380 1385 1390

tgc tgc tca atg tcc tac aca tgg aca ggc gcc ttg atc acg cca tgc 4224
Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys
1395 1400 1405



gct gcg gag gaa agc aag ctg ccc atc aac gcg ttg agc aac tct ttg 4272
Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu
1410 1415 1420

ctg cgc cac cat aac atg gtt tat gcc aca aca tct cgc agc gca ggc 4320
Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly
1425 1430 1435 1440

ctg cgg cag aag aag gtc acc ttt gac aga ctg caa gtc ctg gac gac 4368
Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp
1445 1450 1455

cac tac cgg gac gtg ctc aag gag atg aag gcg aag gcg tcc aca gtt 4416
His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1460 1465 1470

aag gct aaa ctc cta tcc gta gag gaa gcc tgc aag ctg acg ccc cca 4464
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro
1475 1480 1485

cat tcg gcc aaa tcc aag ttt ggc tat ggg gca aag gac gtc cgg aac 4512
His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn
1490 1495 1500

cta tcc agc aag gcc gtt aac cac atc cac tcc gtg tgg aag gac ttg 4560
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu
1505 1510 1515 1520

ctg gaa gac act gtg aca cca att gac acc acc atc atg gca aaa aat 4608
Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
1525 1530 1535

gag gtt ttc tgt gtc caa cca gag aaa gga ggc cgt aag cca gcc cgc 4656
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
1540 1545 1550

ctt atc gta ttc cca gat ctg gga gtc cgt gta tgc gag aag atg gcc 4704
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala
1555 1560 1565






ctc tat gat gtg gtc tcc acc ctt cct cag gtc gtg atg ggc tcc tca 4752
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser
1570 1575 1580

tac gga ttc cag tac tct cct ggg cag cga gtc gag ttc ctg gtg aat 4800
Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn
1585 1590 1595 1600

acc tgg aaa tca aag aaa aac ccc atg ggc ttt tca tat gac act cgc 4848
Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg
1605 1610 1615

tgt ttc gac tca acg gtc acc gag aac gac atc cgt gtt gag gag tca 4896
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser
1620 1625 1630

att tac caa tgt tgt gac ttg gcc ccc gaa gcc aga cag gcc ata aaa 4944
Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys
1635 1640 1645

tcg ctc aca gag cgg ctt tat atc ggg ggt cct ctg act aat tca aaa 4992
Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys
1650 1655 1660

ggg cag aac tgc ggt tat cgc cgg tgc cgc gcg agc ggc gtg ctg acg 5040
Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr
1665 1670 1675 1680

act agc tgc ggt aac acc ctc aca tgt tac ttg aag gcc tct gca gcc 5088
Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala
1685 1690 1695

tgt cga gct gcg aag ctc cag gac tgc acg atg ctc gtg aac gga gac 5136
Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp
1700 1705 1710

gac ctt gtc gtt atc tgt gaa agc gcg gga acc caa gag gac gcg gcg 5184
Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala
1715 1720 1725

agc cta cga gtc ttc acg gag gct atg act agg tac tct gcc ccc ccc 5232
Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro
1730 1735 1740

ggg gac ccg ccc caa cca gaa tac gac ttg gag ctg ata aca tca tgt 5280
Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys
1745 1750 1755 1760

tcc tcc aat gtg tcg gtc gcc cac gat gca tca ggc aaa agg gtg tac 5328
Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr
1765 1770 1775

tac ctc acc cgt gat ccc acc acc ccc ctc gca cgg gct gcg tgg gaa 5376
Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu
1780 1785 1790

aca gct aga cac act cca gtt aac tcc tgg cta ggc aac att atc atg 5424
Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met
1795 1800 1805

tat gcg ccc act ttg tgg gca agg atg att ctg atg act cac ttc ttc 5472
Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe
1810 1815 1820

tcc atc ctt cta gca cag gag caa ctt gaa aaa gcc ctg gac tgc cag 5520
Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln
1825 1830 1835 1840

atc tac ggg gcc tgt tac tcc att gag cca ctt gac cta cct cag atc 5568
Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile
1845 1850 1855

att gaa cga ctc cat ggc ctt agc gca ttt tca ctc cat agt tac tct 5616
Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser
1860 1865 1870

cca ggt gag atc aat agg gtg gct tca tgc ctc agg aaa ctt ggg gta 5664
Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val
1875 1880 1885

cca ccc ttg cga gtc tgg aga cat cgg gcc agg agc gtc cgc gct agg 5712
Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg
1890 1895 1900

cta ctg tcc cag ggg ggg agg gcc gcc act tgt ggc aag tac ctc ttc 5760
Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe
1905 1910 1915 1920

aac tgg gca gtg aag acc aaa ctc aaa ctc act cca atc ccg gct gcg 5808
Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala
1925 1930 1935

tcc cag ctg gac ttg tcc ggc tgg ttc gtt gct ggt tac agc ggg gga 5856
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly
1940 1945 1950

gac ata tat cac agc ctg tct cgt gcc cga ccc cgc tgg ttc atg ctg 5904
Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu
1955 1960 1965

tgc cta ctc cta ctt tct gta ggg gta ggc atc tac ctg ctc ccc aac 5952
Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn
1970 1975 1980

cga 5955
Arg
1985

<210> 6
<211> 1984
<212> PRT

<213> Artificial Sequence 
<220> 

<223> NS sequence 
<400> 6 

Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys
1 5 10 15

Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu
20 25 30

Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val
35 40 45

Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu
50 55 60

Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln
65 70 75 80

Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro
85 90 95

Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp
100 105 110

Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser
115 120 125

Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu
130 135 140

Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr
145 150 155 160

Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu
165 170 175

Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala
180 185 190

Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser
195 200 205

Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys
210 215 220

Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala
225 230 235 240

Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val
245 250 255

Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys
260 265 270

Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile
275 280 285

Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly
290 295 300

Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu
305 310 315 320

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile
325 330 335

Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys
340 345 350

Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys
355 360 365

His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu
370 375 380

Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile
385 390 395 400

Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr
405 410 415

Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val
420 425 430

Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
435 440 445

Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg
450 455 460

Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
465 470 475 480

Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
485 490 495

Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg
500 505 510

Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His
515 520 525

Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
530 535 540

His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu
545 550 555 560

Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro
565 570 575

Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu
580 585 590

His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu
595 600 605

Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser
610 615 620

Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val
625 630 635 640

Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile
645 650 655

Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp Arg
660 665 670

Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His
675 680 685

Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln
690 695 700

Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala
705 710 715 720

Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala
725 730 735

Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu
740 745 750

Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr
755 760 765

Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe Asn
770 775 780

Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala
785 790 795 800

Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile
805 810 815

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
820 825 830

Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro
835 840 845

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
850 855 860

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
865 870 875 880

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe
885 890 895

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser
900 905 910

Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile Thr
915 920 925

Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser Thr
930 935 940

Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys Thr
945 950 955 960

Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro Gln
965 970 975

Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly Val
980 985 990

Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala Gln
995 1000 1005

Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro Lys
1010 1015 1020

Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr Thr
1025 1030 1035 1040

Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala Leu
1045 1050 1055

Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly Asp
1060 1065 1070

Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro Cys
1075 1080 1085

Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg Leu
1090 1095 1100

His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val Thr
1105 1110 1115 1120

Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys
1125 1130 1135

Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro
1140 1145 1150

Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser
1155 1160 1165

Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser
1170 1175 1180

Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp Leu
1185 1190 1195 1200

Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr
1205 1210 1215

Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp Pro
1220 1225 1230

Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile
1235 1240 1245

Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala Arg
1250 1255 1260

Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp Tyr
1265 1270 1275 1280

Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala Pro
1285 1290 1295

Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser
1300 1305 1310

Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly Ser
1315 1320 1325

Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro Asp
1330 1335 1340

Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr Ser
1345 1350 1355 1360

Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp
1365 1370 1375

Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val Cys
1380 1385 1390

Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala
1395 1400 1405

Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu
1410 1415 1420

Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly Leu
1425 1430 1435 1440

Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp His
1445 1450 1455

Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys
1460 1465 1470

Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His
1475 1480 1485

Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu
1490 1495 1500

Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu Leu
1505 1510 1515 1520

Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
1525 1530 1535

Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
1540 1545 1550

Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
1555 1560 1565

Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser Tyr
1570 1575 1580

Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Thr
1585 1590 1595 1600

Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
1605 1610 1615

Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile
1620 1625 1630

Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser
1635 1640 1645

Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly
1650 1655 1660

Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1665 1670 1675 1680

Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys
1685 1690 1695

Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp Asp
1700 1705 1710

Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser
1715 1720 1725

Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
1730 1735 1740

Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1745 1750 1755 1760

Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr Tyr
1765 1770 1775

Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr
1780 1785 1790

Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Tyr
1795 1800 1805

Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser
1810 1815 1820

Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln Ile
1825 1830 1835 1840

Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile Ile
1845 1850 1855

Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro
1860 1865 1870

Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val Pro
1875 1880 1885

Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu
1890 1895 1900

Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe Asn
1905 1910 1915 1920

Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala Ser
1925 1930 1935

Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly Asp
1940 1945 1950

Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu Cys
1955 1960 1965

Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1970 1975 1980



<210> 7 
<211> 4909 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pVU nucleic acid 



<400> 7 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 

ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300 

tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360 

ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420 

cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480 

catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540
tgcccacttg
tgacggtaaa 
ttggcagtac 
catcaatggg 
cgtcaatggg 
ctccgcccca 
agctcgttta 
tagaagacac 
tccccgtgcc 
tcttatgcat 
taggtgatgg 
tattggtgac 
tattggctat 
ggatggggtc 
cgcagttttt 
catgggctct 
agcggctcat 
agcacaatgc 
gaaaatgagc 
gcagaagaag 
gttgcggtgc 
cgcgccacca 
tgcagtcacc 
atctgctgtg 
gaccctggaa 
ttgtctgagt 
ggattgggaa 
caggtgctga 
ctctgtgaca 
tcatagctca 
ccctccctca 
aagataggct 
gagaaatcat 
gctgcggcga 
ggataacgca
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
ttaaaaatga 
ccaatgctta 
tgcctgactc 
ataccaggcc 
agctttgttg 
tgcgttgtcg 
acaaagccgc 
aattctgatt 



gcagtacatc 

tggcccgcct 

atctacgtat 

cgtggatagc 

agtttgtttt 

ttgacgcaaa 

gtgaaccgtc 

cgggaccgat 

aagagtgacg 

gctatactgt 

tatagcttag 

gatactttcc 

atgccaatac 

ccatttatta 

attaaacata 

tctccggtag 

ggtcgctcgg 

ccaccaccac 

gtggagattg 

atgcaggcag 

tgttaacggt 

gacataatag 

gtccttagat 

ccttctagtt 

ggtgccactc 

aggtgtcatt 

gacaatagca 

agaattgacc 

caccctgtcc 

ggagggctcc 

tcagcccacc 

attaagtgca 

agaatttctt 

gcggtatcag 

ggaaagaaca 

ctggcgtttt 

cagaggtggc 

ctcgtgcgct 

tcgggaagcg 

gttcgctcca 

tccggtaact 

gccactggta 

tggtggccta 

ccagttacct 

agcggtggtt 

gatcctttga 

attttggtca 

agttttaaat 

atcagtgagg 

gggggggggg 
tgaatcgccc 
taggtggacc 
ggaagatgcg 
cgtcccgtca 
agaaaaactc 



aagtgtatca 

ggcattatgc 

tagtcatcgc 

ggtttgactc 

ggcaccaaaa 

tgggcggtag 

agatcgcctg 

ccagcctccg 

taagtaccgc 

ttttggcttg 

cctataggtg 

attactaatc 

tctgtccttc 

tttacaaatt 

gcgtgggatc 

cggcggagct 

cagctccttg 

cagtgtgccg 

ggctcgcacg 

ctgagttgtt 

ggagggcagt 

ctgacagact 

ctaggtacca 

gccagccatc 

ccactgtcct 

ctattctggg 

ggcatgctgg 

cggttcctcc 

acgcccctgg 

gccttcaatc 

aaaccaaacc 

gagggagaga 

ccgcttcctc 

ctcactcaaa 

tgtgagcaaa 

tccataggct 

gaaacccgac 

ctcctgttcc 

tggcgctttc 

age tgggctg 

ategtcttga 

acaggattag 

actaeggcta 

teggaaaaag 

tttttgtttg 

tcttttctac 

tgagattatc 

caatctaaag 

cacctatctc 

ggegctgagg 

catcatccag 

agtcggtgat 

tgatctgatc 

agtcagcgta 

atcgagcatc 



tatgecaagt 

ccagtacatg 

tattaccatg 

aeggggattt 

teaaegggae 

gcgtgtacgg 

gagacgecat 

eggcegggaa 

ctatagactc 

gggectatae 

tgggttattg 

cataacatgg 

agagactgac 

cacatataca 

tccacgcgaa 

tccacatccg 

ctcctaacag 

cacaaggccg 

getgaegcag 

gtattctgat 

gtagtctgag 

aacagactgt 

gatatcagaa 

tgttgtttgc 

ttcctaataa 

gggtggggtg 

ggatgcggtg 

tgggccagaa 

ttcttagttc 

ccacccgcta 

tagcctccaa 

aaatgcctcc 

gctcactgac 

ggeggtaata 

aggecagcaa 

ccgcccccct 

aggactataa 

gaccctgccg 

tcatagctca 

tgtgcacgaa 

gtccaacccg 

cagagegagg 

cactagaaga 

agttggtagc 

caagcagcag 

ggggtctgac 

aaaaaggatc 

tatatatgag 

agegatctgt 

tctgcctcgt 

ccagaaagtg 

tttgaacttt 

cttcaactca 

atgctctgcc 

aaatgaaact 



acgcccccta. 

accttatggg 

gtgatgcggt 

ccaagtctcc 

tttccaaaat 

tgggaggtct 

ccacgctgtt 

cggtgcattg 

tataggcaca 

acccccgctt 

accattattg 

ctctttgcca 

aeggactctg 

acaacgccgt 

tetegggtae 

agccctggtc 

tggaggccag 

tggeggtagg 

atggaagact 

aagagtcaga 

cagtactcgt 

tcctttccat 

ttcagtcgac 

ccctcccccg 

aatgaggaaa 

gggcaggaca 

ggctctatgg 

agaagcaggc 

cagccccact 

aagtacttgg 

gagtgggaag 

aacatgtgag 

tcgctgcgct 

eggttatcca 

aaggccagga 

gacgagcatc 

agataccagg 

ettaceggat 

cgctgtaggt 

ccccccgttc 

gtaagacacg 

tatgtaggcg 

acagtatttg 

tcttgatccg 

attacgegea 

gctcagtgga 

ttcacctaga 

taaacttggt 

etatttegtt 

gaagaaggtg 

agggagecac 

tgetttgeca 

gcaaaagttc 

agtgttacaa 

gcaatttatt 



ttgacgtcaa 

actttcctac 

tttggcagta 

accccattga 

gtcgtaacaa 

atataagcag 

ttgacctcca 

gaacgeggat 

cccctttggc 

ecttatgeta 

accactcccc 

caactatctc 

tatttttaca 

cccccgtgcc 

gtgttccgga 

ccatgcctcc 

acttaggcac 

gtatgtgtct 

taaggcagcg 

ggtaactccc 

tgctgccgcg 

gggtcttttc 

agcggccgcg 

tgccttcctt 

ttgeatcgea 

gcaaggggga 

ccgctgcggc 

acatcccctt 

cataggacac 

ageggtctet 

aaattaaagc 

gaagtaatga 

eggtegtteg 

cagaatcagg 

acegtaaaaa 

acaaaaatcg 

cgtttccccc 

acctgtccgc 

atctcagttc 

agcccgaccg 

acttatcgcc 

gtgetacaga 

gtatctgege 

gcaaacaaac 

gaaaaaaagg 

acgaaaactc 

tccttttaaa 

ctgacagtta 

catccatagt 

ttgetgaetc 

ggttgatgag 

eggaaeggtc 

gatttattca 

ccaattaacc 

catatcagga 



600 
660 
720 
780 
840 
900 
960. 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
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ttatcaatac catatttttg aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg 3900 

cagttccata ggatggcaag atcctggtat cggtctgcga ttccgactcg tccaacatca 3960 

atacaaccta ttaatttccc ctcgtcaaaa ataaggttat caagtgagaa atcaccatga 4020 

gtgacgactg aatccggtga gaatggcaaa agcttatgca tttctttcca gacttgttca 4080 

acaggccagc cattacgctc gtcatcaaaa tcactcgcat caaccaaacc gttattcatt 4140 

cgtgattgcg cctgagcgag acgaaatacg cgatcgctgt taaaaggaca attacaaaca 4200 

ggaatcgaat gcaaccggcg caggaacact gccagcgcat caacaatatt ttcacctgaa 4260 

tcaggatatt cttctaatac ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac 4320 

catgcatcat caggagtacg gataaaatgc ttgatggtcg gaagaggcat aaattccgtc 4380 

agccagttta gtctgaccat ctcatctgta acatcattgg caacgctacc tttgccatgt 4440 

ttcagaaaca actctggcgc atcgggcttc ccatacaatc gatagattgt cgcacctgat 4500 

tgcccgacat tatcgcgagc ccatttatac ccatataaat cagcatccat gttggaattt 4560 

aatcgcggcc tcgagcaaga cgtttcccgt tgaatatggc tcataacacc ccttgtatta 4620 

ctgtttatgt aagcagacag ttttattgtt catgatgata tatttttatc ttgtgcaatg 4680 

taacatcaga gattttgaga cacaacgtgg ctttcccccc ccccccatta ttgaagcatt 4740 

tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa 4800 

ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga aaccattatt 4860 

atcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtc 4909 

<210> 8 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 6 
<400> 8 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60 

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120 

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180 

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240 

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300 

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360 

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420 

cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480 

tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540 

tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600 

aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660 

tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720 

cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780 

gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840 

cctttcccgg cagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900 

ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960 

cgaggatgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020 

caggtcttgt cattatcacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080 

ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140 

tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200 

gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260 

ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320 

cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380 

ccttctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440 

gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500 

cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560 

ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620 

gagataatgt ttaacttgca tggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680 

cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740 

ttttctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800

tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860

gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920 

caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980 

gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040 

agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100 

aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160 

cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220 

gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280 

gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340 

gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400 

gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460

tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520

ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580 

tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640 

agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700 

tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760

gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820 

acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttttactgct 2880 

gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940 

aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000 

ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060 

gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120 

tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180 

acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240

aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300 

tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360 

gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420 

tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480 

ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540 

tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600 

ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660 

gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720 

ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780 

agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840 

actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900 

acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960 

ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020 

atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080 

cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140 

cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200 

acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260 

gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320 

ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380 

taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440 

tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500 

tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560

tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620 

gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680 

ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740 

gtataatggt tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg 4800 

ctttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860

gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920 

cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980 

tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040 

ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100

caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160

gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220 

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280 

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340 

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400 

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460 

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520 

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580 

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640 

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700 

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820 

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880 

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940 

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000 

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060 

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120 

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180 

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240 

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300 

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360 

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420 

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480 

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540 

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600 

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660 

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720 

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780 

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840 

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900 

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960 

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020 

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080 

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140 

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200 

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260 

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320 

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380 

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440 

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500 

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560 

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620 

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740 

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800 

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860 

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980 

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040 

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100 

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160 

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220 

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280 

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340 

aacatcgcgc agatgggagc tgtccatggt ctggagctcc cgcggcgtca ggtcaggcgg 8400

gagctcctgc aggtttacct cgcatagacg ggtcagggcg cgggctagat ccaggtgata 8460

cctaatttcc aggggctggt tggtggcggc gtcgatggct tgcaagaggc cgcatccccg 8520

cggcgcgact acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc 8580

atctaaaagc ggtgacgcgg gcgagccccc ggaggtaggg ggggctccgg acccgccggg 8640

agagggggca ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcgtagg 8700

ttgctggcga acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag 8760

acgacgggcc cggtgagctt gagcctgaaa gagagttcga cagaatcaat ttcggtgtcg 8820

ttgacggcgg cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatc 8880

tcggccatga actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg 8940

gtggcggcga ggtcgttgga aatgcgggcc atgagctgcg agaaggcgtt gaggcctccc 9000

tcgttccaga cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc 9060

tgcgcgagat tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag 9120

aggtagttga gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc 9180

aacgtggatt cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc 9240

acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga 9300

cggatgagct cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct 9360

tcttcttcaa tctcctcttc cataagggcc tccccttctt cttcttctgg cggcggtggg 9420

ggagggggga cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc 9480

atctccccgc ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc 9540

agttggaaga cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccatgcggc 9600

agggatacgg cgctaacgat gcatctcaac aattgttgtg taggtactcc gccgccgagg 9660

gacctgagcg agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag 9720

tcacagtcgc aaggtaggct gagcaccgtg gcgggcggca gcgggcggcg gtcggggttg 9780

tttctggcgg aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg 9840

gtcgacagaa gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg 9900

ccccaggctt cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct 9960

accggcactt cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctgcggcg 10020

gcggcggagt ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 10080

ctcatcggct gaagcagggc taggtcggcg acaacgcgct cggctaatat ggcctgctgc 10140

acctgcgtga gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg 10200

ttgatggtgt aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc 10260

gagagctcgg tgtacctgag acgcgagtaa gccctcgagt caaatacgta gtcgttgcaa 10320

gtccgcacca ggtactggta tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc 10380

cagcgtaggg tggccggggc tccgggggcg agatcttcca acataaggcg atgatatccg 10440

tagatgtacc tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg 10500

cggacgcggt tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg 10560

ccggtcaggc gcgcgcaatc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg 10620

ggcactcttc cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt 10680

tcgagccccg tatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc 10740

caggtgtgcg acgtcagaca acgggggagt gctccttttg gcttccttcc aggcgcggcg 10800

gctgctgcgc tagctttttt ggccactggc cgcgcgcagc gtaagcggtt aggctggaaa 10860

gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc 10920

gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg gtttgcctcc 10980

ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc 11040

ttttcccaga tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag 11100

caagagcagc ggcagacatg cagggcaccc tcccctcctc ctaccgcgtc aggaggggcg 11160

acatccgcgg ttgacgcggc agcagatggt gattacgaac ccccgcggcg ccgggcccgg 11220

cactacctgg acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag 11280

cggtacccaa gggtgcagct gaagcgtgat acgcgtgagg cgtacgtgcc gcggcagaac 11340

ctgtttcgcg accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca 11400

gggcgcgagc tgcggcatgg cctgaatcgc gagcggttgc tgcgcgagga ggactttgag 11460

cccgacgcgc gaaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta 11520

accgcatacg agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac 11580

gtgcgtacgc ctgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt 11640

gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700

gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagtggt gcaggagcgc 11820 

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880 

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940 

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000 

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac 12060 

cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120 

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180 

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240 

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300 

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc 12360 

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420 

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480 

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540 

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600 

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660 

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720 

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780 

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840 

caccgcaaag tgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900 

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960 

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020 

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080 

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140 

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200 

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260 

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320 

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380 

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440 

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500 

gtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560 

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620 

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680 

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740 

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800 

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860 

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920 

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980 

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040 

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100 

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160 

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220 

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280 

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340 

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400 

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460 

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520 

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580 

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640 

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700 

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760 

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820 

cactggtctt gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880 

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940 

caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000

cattcccgca ctgttggatg tggacgccta ccaggcgagc ttgaaagatg acaccgaaca 15060

gggcgggggt ggcgcaggcg gcagcaacag cagtggcagc ggcgcggaag agaactccaa 15120 

cgcggcagcc gcggcaatgc agccggtgga ggacatgaac gatcatgcca ttcgcggcga 15180 

cacctttgcc acacgggctg aggagaagcg cgctgaggcc gaagcagcgg ccgaagctgc 15240 

cgcccccgct gcgcaacccg aggtcgagaa gcctcagaag aaaccggtga tcaaacccct 15300 

gacagaggac agcaagaaac gcagttacaa cctaataagc aatgacagca ccttcaccca 15360 

gtaccgcagc tggtaccttg catacaacta cggcgaccct cagaccggaa tccgctcatg 15420 

gaccctgctt tgcactcctg acgtaacctg cggctcggag caggtctact ggtcgttgcc 15480 

agacatgatg caagaccccg tgaccttccg ctccacgcgc cagatcagca actttccggt 15540 

ggtgggcgcc gagctgttgc ccgtgcactc caagagcttc tacaacgacc aggccgtcta 15600 

ctcccaactc atccgccagt ttacctctct gacccacgtg ttcaatcgct ttcccgagaa 15660 

ccagattttg gcgcgcccgc cagcccccac catcaccacc gtcagtgaaa acgttcctgc 15720 

tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc agcgagtgac 15780 

cattactgac gccagacgcc gcacctgccc ctacgtttac aaggccctgg gcatagtctc 15840 

gccgcgcgtc ctatcgagcc gcactttttg agcaagcatg tccatcctta tatcgcccag 15900 

caataacaca ggctggggcc tgcgcttccc aagcaagatg tttggcgggg ccaagaagcg 15960 

ctccgaccaa cacccagtgc gcgtgcgcgg gcactaccgc gcgccctggg gcgcgcacaa 16020 

acgcggccgc actgggcgca ccaccgtcga tgacgccatc gacgcggtgg tggaggaggc 16080 

gcgcaactac acgcccacgc cgccaccagt gtccacagtg gacgcggcca ttcagaccgt 16140 

ggtgcgcgga gcccggcgct atgctaaaat gaagagacgg cggaggcgcg tagcacgtcg 16200 

ccaccgccgc cgacccggca ctgccgccca acgcgcggcg gcggccctgc ttaaccgcgc 16260 

acgtcgcacc ggccgacggg cggccatgcg ggccgctcga aggctggccg cgggtattgt 16320 

cactgtgccc cccaggtcca ggcgacgagc ggccgccgca gcagccgcgg ccattagtgc 16380 

tatgactcag ggtcgcaggg gcaacgtgta ttgggtgcgc gactcggtta gcggcctgcg 16440 

cgtgcccgtg cgcacccgcc ccccgcgcaa ctagattgca agaaaaaact acttagactc 16500 

gtactgttgt atgtatccag cggcggcggc gcgcaacgaa gctatgtcca agcgcaaaat 16560 

caaagaagag atgctccagg tcatcgcgcc ggagatctat ggccccccga agaaggaaga 16620 

gcaggattac aagccccgaa agctaaagcg ggtcaaaaag aaaaagaaag atgatgatga 16680 

tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggcgac gggtacagtg 16740 

gaaaggtcga cgcgtaaaac gtgttttgcg acccggcacc accgtagtct ttacgcccgg 16800 

tgagcgctcc acccgcacct acaagcgcgt gtatgatgag gtgtacggcg acgaggacct 16860 

gcttgagcag gccaacgagc gcctcgggga gtttgcctac ggaaagcggc ataaggacat 16920 

gctggcgttg ccgctggacg agggcaaccc aacacctagc ctaaagcccg taacactgca 16980 

gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc gcgagtctgg 17040 

tgacttggca cccaccgtgc agctgatggt acccaagcgc cagcgactgg aagatgtctt 17100 

ggaaaaaatg accgtggaac ctgggctgga gcccgaggtc cgcgtgcggc caatcaagca 17160

ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atacccacta ccagtagcac 17220 

cagtattgcc accgccacag agggcatgga gacacaaacg tccccggttg cctcagcggt 17280 

ggcggatgcc gcggtgcagg cggtcgctgc ggccgcgtcc aagacctcta cggaggtgca 17340 

aacggacccg tggatgtttc gcgtttcagc cccccggcgc ccgcgcggtt cgaggaagta 17400 

cggcgccgcc agcgcgctac tgcccgaata tgccctacat ccttccattg cgcctacccc 17460 

cggctatcgt ggctacacct accgccccag aagacgagca actacccgac gccgaaccac 17520 

cactggaacc cgccgccgcc gtcgccgtcg ccagcccgtg ctggccccga tttccgtgcg 17580 

cagggtggct cgcgaaggag gcaggaccct ggtgctgcca acagcgcgct accaccccag 17640 

catcgtttaa aagccggtct ttgtggttct tgcagatatg gccctcacct gccgcctccg 17700 

tctcccggtg ccgggattcc gaggaagaat gcaccgtagg aggggcatgg ccggccacgg 17760 

cctgacgggc ggcatgcgtc gtgcgcacca ccggcggcgg cgcgcgtcgc accgtcgcat 17820 

gcgcggcggt atcctgcccc tccttattcc actgatcgcc gcggcgattg gcgccgtgcc 17880 

cggaattgca tccgtggcct tgcaggcgca gagacactga ttaaaaacaa gttgcatgtg 17940 

gaaaaatcaa aataaaaagt ctggactctc acgctcgctt ggtcctgtaa ctattttgta 18000 

gaatggaaga catcaacttt gcgtctctgg ccccgcgaca cggctcgcgc ccgttcatgg 18060 

gaaactggca agatatcggc accagcaata tgagcggtgg cgccttcagc tggggctcgc 18120 

tgtggagcgg cattaaaaat ttcggttcca ccgttaagaa ctatggcagc aaggcctgga 18180 

acagcagcac aggccagatg ctgagggata agttgaaaga gcaaaatttc caacaaaagg 18240 

tggtagatgg cctggcctct ggcattagcg gggtggtgga cctggccaac caggcagtgc 18300

aaaataagat taacagtaag cttgatcccc gccctcccgt agaggagcct ccaccggccg 18360

tggagacagt gtctccagag gggcgtggcg aaaagcgtcc gcgccccgac agggaagaaa 18420 

ctctggtgac gcaaatagac gagcctccct cgtacgagga ggcactaaag caaggcctgc 18480 

ccaccacccg tcccatcgcg cccatggcta ccggagtgct gggccagcac acacccgtaa 18540 

cgctggacct gcctcccccc gccgacaccc agcagaaacc tgtgctgcca ggcccgaccg 18600 

ccgttgttgt aacccgtcct agccgcgcgt ccctgcgccg cgccgccagc ggtccgcgat 18660 

cgttgcggcc cgtagccagt ggcaactggc aaagcacact gaacagcatc gtgggtctgg 18720 

gggtgcaatc cctgaagcgc cgacgatgct tctgaatagc taacgtgtcg tatgtgtgtc 18780 

atgtatgcgt ccatgtcgcc gccagaggag ctgctgagcc gccgcgcgcc cgctttccaa 18840 

gatggctacc ccttcgatga tgccgcagtg gtcttacatg cacatctcgg gccaggacgc 18900 

ctcggagtac ctgagccccg ggctggtgca gtttgcccgc gccaccgaga cgtacttcag 18960 

cctgaataac aagtttagaa accccacggt ggcgcctacg cacgacgtga ccacagaccg 19020 

gtcccagcgt ttgacgctgc ggttcatccc tgtggaccgt gaggatactg cgtactcgta 19080 

caaggcgcgg ttcaccctag ctgtgggtga taaccgtgtg ctggacatgg cttccacgta 19140 

ctttgacatc cgcggcgtgc tggacagggg ccctactttt aagccctact ctggcactgc 19200 

ctacaacgcc ctggctccca agggtgcccc aaatccttgc gaatgggatg aagctgctac 19260 

tgctcttgaa ataaacctag aagaagagga cgatgacaac gaagacgaag tagacgagca 19320 

agctgagcag caaaaaactc acgtatttgg gcaggcgcct tattctggta taaatattac 19380 

aaaggagggt attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaaacatt 19440 

tcaacctgaa cctcaaatag gagaatctca gtggtacgaa actgaaatta atcatgcagc 19500 

tgggagagtc cttaaaaaga ctaccccaat gaaaccatgt tacggttcat atgcaaaacc 19560 

cacaaatgaa aatggagggc aaggcattct tgtaaagcaa caaaatggaa agctagaaag 19620 

tcaagtggaa atgcaatttt tctcaactac tgaggcgacc gcaggcaatg gtgataactt 19680 

gactcctaaa gtggtattgt acagtgaaga tgtagatata gaaaccccag acactcatat 19740 

ttcttacatg cccactatta aggaaggtaa ctcacgagaa ctaatgggcc aacaatctat 19800 

gcccaacagg cctaattaca ttgcttttag ggacaatttt attggtctaa tgtattacaa 19860 

cagcacgggt aatatgggtg ttctggcggg ccaagcatcg cagttgaatg ctgttgtaga 19920 

tttgcaagac agaaacacag agctttcata ccagcttttg cttgattcca ttggtgatag 19980 

aaccaggtac ttttctatgt ggaatcaggc tgttgacagc tatgatccag atgttagaat 20040 

tattgaaaat catggaactg aagatgaact tccaaattac tgctttccac tgggaggtgt 20100 

gattaataca gagactctta ccaaggtaaa acctaaaaca ggtcaggaaa atggatggga 20160 

aaaagatgct acagaatttt cagataaaaa tgaaataaga gttggaaata attttgccat 20220 

ggaaatcaat ctaaatgcca acctgtggag aaatttcctg tactccaaca tagcgctgta 20280 

tttgcccgac aagctaaagt acagtccttc caacgtaaaa atttctgata acccaaacac 20340 

ctacgactac atgaacaagc gagtggtggc tcccgggtta gtggactgct acattaacct 20400 

tggagcacgc tggtcccttg actatatgga caacgtcaac ccatttaacc accaccgcaa 20460 

tgctggcctg cgctaccgct caatgttgct gggcaatggt cgctatgtgc ccttccacat 20520 

ccaggtgcct cagaagttct ttgccattaa aaacctcctt ctcctgccgg gctcatacac 20580 

ctacgagtgg aacttcagga aggatgttaa catggttctg cagagctccc taggaaatga 20640 

cctaagggtt gacggagcca gcattaagtt tgatagcatt tgcctttacg ccaccttctt 20700 

ccccatggcc cacaacaccg cctccacgct tgaggccatg cttagaaacg acaccaacga 20760 

ccagtccttt aacgactatc tctccgccgc caacatgctc taccctatac ccgccaacgc 20820 

taccaacgtg cccatatcca tcccctcccg caactgggcg gctttccgcg gctgggcctt 20880 

cacgcgcctt aagactaagg aaaccccatc actgggctcg ggctacgacc cttattacac 20940 

ctactctggc tctataccct acctagatgg aaccttttac ctcaaccaca cctttaagaa 21000 

ggtggccatt acctttgact cttctgtcag ctggcctggc aatgaccgcc tgcttacccc 21060 

caacgagttt gaaattaagc gctcagttga cggggagggt tacaacgttg cccagtgtaa 21120 

catgaccaaa gactggttcc tggtacaaat gctagctaac tacaacattg gctaccaggg 21180 

cttctatatc ccagagagct acaaggaccg catgtactcc ttctttagaa acttccagcc 21240 

catgagccgt caggtggtgg atgatactaa atacaaggac taccaacagg tgggcatcct 21300 

acaccaacac aacaactctg gatttgttgg ctaccttgcc cccaccatgc gcgaaggaca 21360 

ggcctaccct gctaacttcc cctatccgct tataggcaag accgcagttg acagcattac 21420 

ccagaaaaag tttctttgcg atcgcaccct ttggcgcatc ccattctcca gtaactttat 21480 

gtccatgggc gcactcacag acctgggcca aaaccttctc tacgccaact ccgcccacgc 21540 

gctagacatg acttttgagg tggatcccat ggacgagccc acccttcttt atgttttgtt 21600

tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780 

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840 

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900 

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960 

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020 

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080 

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140 

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200 

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260 

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320 

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380 

aaatgctctt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440 

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500 

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560 

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620 

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680 

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740 

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800 

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860 

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920 

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980 

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040 

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100 

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160 

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220 

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280 

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340 

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460 

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580 

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640 

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700 

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760 

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820 

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880 

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940 

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000 

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060 

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120 

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180 

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240 

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300 

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420 

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480 

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540 

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600 

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660 

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720 

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780 

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840 

gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900 
atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960

caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020

cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080

gtggagcttg agtgcatgca gcggttcttt gctgacccgg agatgcagcg caagctagag 25140

gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200

gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260

aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320

tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380

gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440

gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500

cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560

aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620

gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680

ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740

ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800

aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860

cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920

taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980

caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040

ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100

gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160

tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220

gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280

cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340

cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400

gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460

gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520

gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580

gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640

ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700

tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760

cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820

cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880

agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940

aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000

acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060

actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120

tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180

aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240

ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300

gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360

ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420

gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480

actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540

taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600

cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660

cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720

ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780

gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840

cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900

tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960

tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020

ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080

gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140

gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200
cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120

gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180

ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240

ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300

acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360

agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420

tgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480

acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540

gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600

gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660

acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720

taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780

atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840

aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900

ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960

aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020

actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080

gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140

gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200

cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260

gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320

aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380

gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440

ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500
gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560

actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620 

gagcccattt atacacaaaa tggaaaacta ggactaaagt acggggctcc tttgcatgta 31680 

acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740 

tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800 

aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860 

tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920 

aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980 

aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040 

acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100 

acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160 

gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220 

aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280 

aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340 

gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400 

agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460 

gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520 

gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580 

agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640 

attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700 

ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760 

actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820 

ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880 

tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940 

acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000 

catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060 

caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagt tcatgtcgct 33120 

gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180 

agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240 

ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300 

ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360 

ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420 

aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480 

agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540 

cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600 

tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660 

ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720 

ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780 

cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840 

aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900 

cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960 

agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020 

caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080 

tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140 

tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200 

tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260 

cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320 

cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380 

aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440 

aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500 

ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560 

ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620 

agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680 

ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740 

gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800 






gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860

tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920

ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980

tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040

caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100

ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160

ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220

cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280

gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340

cgaccgaaat agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400

ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460

tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520

cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580

gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640

actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700

ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760

cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820

ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880

accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935



<210> 9 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 5 



<400> 9 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420

cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480

tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540

tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600

aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660

tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720

cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780

gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840

cctttcccgg cagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900

ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960

cgaggacgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020

caggtcctgt cattatcacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080

ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140

tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200

gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260

ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320

cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380

ccttctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440

gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500

cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560

ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620

gagataatgt ttaacttgca tggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680

cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740
ttttctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800

tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860 

gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920 

caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980 

gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040 

agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100 

aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160 

cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220 

gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280 

gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340 

gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400 

gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460 

tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520 

ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580 

tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640 

agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700 

tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760 

gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820 

acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttttactgct 2880 

gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940 

aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000

ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060 

gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120 

tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180 

acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240 

aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300 

tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360 

gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420 

tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480 

ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540 

tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600 

ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660 

gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720 

ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780 

agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840 

actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900 

acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960 

ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020 

atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080 

cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140 

cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200

acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260 

gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320 

ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380 

taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440 

tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500 

tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560 

tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620 

gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680 

ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740 

gtataatggt tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg 4800 

ctttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860 

gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920 

cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980 

tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040 
ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100

caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160

gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220 

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280 

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340 

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400 

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460 

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520 

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580 

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640 

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700 

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760 

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820 

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880 

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940 

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000 

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060 

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120 

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180 

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360 

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420 

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540 

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600 

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660 

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720 

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780 

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840 

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900 

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960 

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020 

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080 

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140 

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200 

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260 

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320 

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380 

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440 

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500 

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560 

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620 

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680 

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740 

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800 

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860 

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920 

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980 

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040 

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100 

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160 

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220 

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280 

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340 
aacatcgcgc agatgggagc tgtccatggt ctggagctcc cgcggcgtca ggtcaggcgg 8400

gagctcctgc aggtttacct cgcatagacg ggtcagggcg cgggctagat ccaggtgata 8460 

cctaatttcc aggggctggt tggtggcggc gtcgatggct tgcaagaggc cgcatccccg 8520 

cggcgcgact acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc 8580 

atctaaaagc ggtgacgcgg gcgagccccc ggaggtaggg ggggctccgg acccgccggg 8640 

agagggggca ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcgtagg 8700 

ttgctggcga acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag 8760 

acgacgggcc cggtgagctt gagcctgaaa gagagttcga cagaatcaat ttcggtgtcg 8820 

ttgacggcgg cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatc 8880 

tcggccatga actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg 8940 

gtggcggcga ggtcgttgga aatgcgggcc atgagctgcg agaaggcgtt gaggcctccc 9000 

tcgttccaga cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc 9060 

tgcgcgagat tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag 9120 

aggtagttga gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc 9180 

aacgtggatt cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc 9240 

acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga 9300 

cggatgagct cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct 9360 

tcttcttcaa tctcctcttc cataagggcc tccccttctt cttcttctgg cggcggtggg 9420 

ggagggggga cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc 9480 

atctccccgc ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc 9540 

agttggaaga cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccatgcggc 9600 

agggatacgg cgctaacgat gcatctcaac aattgttgtg taggtactcc gccgccgagg 9660 

gacctgagcg agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag 9720 

tcacagtcgc aaggtaggct gagcaccgtg gcgggcggca gcgggcggcg gtcggggttg 9780 

tttctggcgg aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg 9840 

gtcgacagaa gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg 9900 

ccccaggctt cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct 9960 

accggcactt cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctgcggcg 10020 

gcggcggagt ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 10080 

ctcatcggct gaagcagggc taggtcggcg acaacgcgct cggctaatat ggcctgctgc 10140 

acctgcgtga gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg 10200 

ttgatggtgt aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc 10260 

gagagctcgg tgtacctgag acgcgagtaa gccctcgagt caaatacgta gtcgttgcaa 10320 

gtccgcacca ggtactggta tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc 10380 

cagcgtaggg tggccggggc tccgggggcg agatcttcca acataaggcg atgatatccg 10440 

tagatgtacc tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg 10500 

cggacgcggt tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg 10560 

ccggtcaggc gcgcgcaatc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg 10620 

ggcactcttc cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt 10680 

tcgagccccg tatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc 10740 

caggtgtgcg acgtcagaca acgggggagt gctccttttg gcttccttcc aggcgcggcg 10800 

gctgctgcgc tagctttttt ggccactggc cgcgcgcagc gtaagcggtt aggctggaaa 10860 

gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc 10920 

gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg gtttgcctcc 10980 

ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc 11040 

ttttcccaga tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag 11100 

caagagcagc ggcagacatg cagggcaccc tcccctcctc ctaccgcgtc aggaggggcg 11160 

acatccgcgg ttgacgcggc agcagatggt gattacgaac ccccgcggcg ccgggcccgg 11220 

cactacctgg acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag 11280 

cggtacccaa gggtgcagct gaagcgtgat acgcgtgagg cgtacgtgcc gcggcagaac 11340 

ctgtttcgcg accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca 11400 

gggcgcgagc tgcggcatgg cctgaatcgc gagcggttgc tgcgcgagga ggactttgag 11460 

cccgacgcgc gaaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta 11520 

accgcatacg agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac 11580 

gtgcgtacgc ttgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt 11640 
gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700

gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760 

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagcggt gcaggagcgc 11820 

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880 

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940 

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000 

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac 12060 
cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180 

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240 

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc 12360 

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420 

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480 

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540 

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600 

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660 

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720 

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780 

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840 

caccgcaaag cgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900 

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960 

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020 

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080 

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140 

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200 

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260 

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320 

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380 

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440 

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500 

gtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560 

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620 

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680 

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740 

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860 

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920 

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980 

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040 

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100 

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160 

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220 

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280 

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340 

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400 

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460 

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580 

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640 

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700 

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760 

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820 

cactggtcct gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880 

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940 
caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000

cattcccgca ctgttggatg tggacgccta ccaggcgagc ttgaaagatg acaccgaaca 15060 

gggcgggggt ggcgcaggcg gcagcaacag cagtggcagc ggcgcggaag agaactccaa 15120 

cgcggcagcc gcggcaatgc agccggtgga ggacatgaac gatcatgcca ttcgcggcga 15180 

cacctttgcc acacgggctg aggagaagcg cgctgaggcc gaagcagcgg ccgaagctgc 15240 

cgcccccgct gcgcaacccg aggtcgagaa gcctcagaag aaaccggtga tcaaacccct 15300 

gacagaggac agcaagaaac gcagttacaa cctaataagc aatgacagca ccttcaccca 15360 

gtaccgcagc tggtaccttg catacaacta cggcgaccct cagaccggaa tccgctcatg 15420 

gaccctgctt tgcactcctg acgtaacctg cggctcggag caggtctact ggtcgttgcc 15480 

agacatgatg caagaccccg tgaccttccg ctccacgcgc cagatcagca actttccggt 15540 

ggtgggcgcc gagctgttgc ccgtgcactc caagagcttc tacaacgacc aggccgtcta 15600 

ctcccaactc atccgccagt ttacctctct gacccacgtg ttcaatcgct ttcccgagaa 15660 

ccagattttg gcgcgcccgc cagcccccac catcaccacc gtcagtgaaa acgttcctgc 15720 

tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc agcgagtgac 15780 

cattactgac gccagacgcc gcacctgccc ctacgtttac aaggccctgg gcatagtctc 15840 

gccgcgcgtc ctatcgagcc gcactttttg agcaagcatg tccatcctta tatcgcccag 15900 

caataacaca ggctggggcc tgcgcttccc aagcaagatg tttggcgggg ccaagaagcg 15960 

ctccgaccaa cacccagtgc gcgtgcgcgg gcactaccgc gcgccctggg gcgcgcacaa 16020 

acgcggccgc actgggcgca ccaccgtcga tgacgccatc gacgcggtgg tggaggaggc 16080 

gcgcaactac acgcccacgc cgccaccagt gtccacagtg gacgcggcca ttcagaccgt 16140 

ggtgcgcgga gcccggcgct atgctaaaat gaagagacgg cggaggcgcg tagcacgtcg 16200 

ccaccgccgc cgacccggca ctgccgccca acgcgcggcg gcggccctgc ttaaccgcgc 16260 

acgtcgcacc ggccgacggg cggccatgcg ggccgctcga aggctggccg cgggtattgt 16320 

cactgtgccc cccaggtcca ggcgacgagc ggccgccgca gcagccgcgg ccattagtgc 16380 

tatgactcag ggtcgcaggg gcaacgtgta ttgggtgcgc gactcggtta gcggcctgcg 16440 

cgtgcccgtg cgcacccgcc ccccgcgcaa ctagattgca agaaaaaact acttagactc 16500 

gtactgttgt atgtatccag cggcggcggc gcgcaacgaa gctatgtcca agcgcaaaat 16560 

caaagaagag atgctccagg tcatcgcgcc ggagatctat ggccccccga agaaggaaga 16620 

gcaggattac aagccccgaa agctaaagcg ggtcaaaaag aaaaagaaag atgatgatga 16680 

tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggcgac gggtacagtg 16740 

gaaaggtcga cgcgtaaaac gtgttttgcg acccggcacc accgtagtct ttacgcccgg 16800 

tgagcgctcc acccgcacct acaagcgcgt gtatgatgag gtgtacggcg acgaggacct 16860 

gcttgagcag gccaacgagc gcctcgggga gtttgcctac ggaaagcggc ataaggacat 16920 

gctggcgttg ccgctggacg agggcaaccc aacacctagc ctaaagcccg taacactgca 16980 

gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc gcgagtctgg 17040 

tgacttggca cccaccgtgc agctgatggt acccaagcgc cagcgactgg aagatgtctt 17100 

ggaaaaaatg accgtggaac ctgggctgga gcccgaggtc cgcgtgcggc caatcaagca 17160 

ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atacccacta ccagtagcac 17220 

cagtattgcc accgccacag agggcatgga gacacaaacg tccccggttg cctcagcggt 17280 

ggcggatgcc gcggtgcagg cggtcgctgc ggccgcgtcc aagacctcta cggaggtgca 17340 

aacggacccg tggatgtttc gcgtttcagc cccccggcgc ccgcgcggtt cgaggaagta 17400 

cggcgccgcc agcgcgctac tgcccgaata tgccctacat ccttccattg cgcctacccc 17460 

cggctatcgt ggctacacct accgccccag aagacgagca actacccgac gccgaaccac 17520 

cactggaacc cgccgccgcc gtcgccgtcg ccagcccgtg ctggccccga tttccgtgcg 17580 

cagggtggct cgcgaaggag gcaggaccct ggtgctgcca acagcgcgct accaccccag 17640 

catcgtttaa aagccggtct ttgtggttct tgcagatatg gccctcacct gccgcctccg 17700 

tttcccggtg ccgggattcc gaggaagaat gcaccgtagg aggggcatgg ccggccacgg 17760 

cctgacgggc ggcatgcgtc gtgcgcacca ccggcggcgg cgcgcgtcgc accgtcgcat 17820 

gcgcggcggt atcctgcccc tccttattcc actgatcgcc gcggcgattg gcgccgtgcc 17880 

cggaattgca tccgtggcct tgcaggcgca gagacactga ttaaaaacaa gttgcatgtg 17940 

gaaaaatcaa aataaaaagt ctggactctc acgctcgctt ggtcctgtaa ctattttgta 18000 

gaatggaaga catcaacttt gcgtctctgg ccccgcgaca cggctcgcgc ccgttcatgg 18060 

gaaactggca agatatcggc accagcaata tgagcggtgg cgccttcagc tggggctcgc 18120 

tgtggagcgg cattaaaaat ttcggttcca ccgttaagaa ctatggcagc aaggcctgga 18180 

acagcagcac aggccagatg ctgagggata agttgaaaga gcaaaatttc caacaaaagg 18240 
tggtagatgg cctggcctct ggcattagcg gggtggtgga cctggccaac caggcagtgc 18300

aaaataagat taacagtaag cttgatcccc gccctcccgt agaggagcct ccaccggccg 18360

tggagacagt gtctccagag gggcgtggcg aaaagcgtcc gcgccccgac agggaagaaa 18420 

ctctggtgac gcaaatagac gagcctccct cgtacgagga ggcactaaag caaggcctgc 18480 

ccaccacccg tcccatcgcg cccatggcta ccggagtgct gggccagcac acacccgtaa 18540 

cgctggacct gcctcccccc gccgacaccc agcagaaacc tgtgctgcca ggcccgaccg 18600 

ccgttgttgt aacccgtcct agccgcgcgt ccctgcgccg cgccgccagc ggtccgcgat 18660 

cgttgcggcc cgtagccagt ggcaactggc aaagcacact gaacagcatc gtgggtctgg 18720 

gggtgcaatc cctgaagcgc cgacgatgct tctgaatagc taacgtgtcg tatgtgtgtc 18780

atgtatgcgt ccatgtcgcc gccagaggag ctgctgagcc gccgcgcgcc cgctttccaa 18840 

gatggctacc ccttcgatga tgccgcagtg gtcttacatg cacatctcgg gccaggacgc 18900 

ctcggagtac ctgagccccg ggctggtgca gtttgcccgc gccaccgaga cgtacttcag 18960

cctgaataac aagtttagaa accccacggt ggcgcctacg cacgacgtga ccacagaccg 19020 

gtcccagcgt ttgacgctgc ggttcatccc tgtggaccgt gaggatactg cgtactcgta 19080 

caaggcgcgg ttcaccctag ctgtgggtga taaccgtgtg ctggacatgg cttccacgta 19140 

ctttgacatc cgcggcgtgc tggacagggg ccctactttt aagccctact ctggcactgc 19200 

ctacaacgcc ctggctccca agggtgcccc aaatccttgc gaatgggatg aagctgctac 19260

tgctcttgaa ataaacctag aagaagagga cgatgacaac gaagacgaag tagacgagca 19320 

agctgagcag caaaaaactc acgtatttgg gcaggcgcct tattctggta taaatattac 19380 

aaaggagggt attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaaacatt 19440 

tcaacctgaa cctcaaatag gagaatctca gtggtacgaa actgaaatta atcatgcagc 19500 

tgggagagtc cttaaaaaga ctaccccaat gaaaccatgt tacggttcat atgcaaaacc 19560 

cacaaatgaa aatggagggc aaggcattct tgtaaagcaa caaaatggaa agctagaaag 19620 

tcaagtggaa atgcaatttt tctcaactac tgaggcgacc gcaggcaatg gtgataactt 19680 

gactcctaaa gtggtattgt acagtgaaga tgtagatata gaaaccccag acactcatat 19740 

ttcttacatg cccactatta aggaaggtaa ctcacgagaa ctaatgggcc aacaatctat 19800 

gcccaacagg cctaattaca ttgcttttag ggacaatttt attggtctaa tgtattacaa 19860 

cagcacgggt aatatgggtg ttctggcggg ccaagcatcg cagttgaatg ctgttgtaga 19920 

tttgcaagac agaaacacag agctttcata ccagcttttg cttgattcca ttggtgatag 19980 

aaccaggtac ttttctatgt ggaatcaggc tgttgacagc tatgatccag atgttagaat 20040 

tattgaaaat catggaactg aagatgaact tccaaattac tgctttccac tgggaggtgt 20100 

gattaataca gagactctta ccaaggtaaa acctaaaaca ggtcaggaaa atggatggga 20160 

aaaagatgct acagaatttt cagataaaaa tgaaataaga gttggaaata attttgccat 20220 

ggaaatcaat ctaaatgcca acctgtggag aaatttcctg tactccaaca tagcgctgta 20280 

tttgcccgac aagctaaagt acagtccttc caacgtaaaa atttctgata acccaaacac 20340 

ctacgactac atgaacaagc gagtggtggc tcccgggtta gtggactgct acattaacct 20400

tggagcacgc tggtcccttg actatatgga caacgtcaac ccatttaacc accaccgcaa 20460 

tgctggcctg cgctaccgct caatgttgct gggcaatggt cgctatgtgc ccttccacat 20520

ccaggtgcct cagaagttct ttgccattaa aaacctcctt ctcctgccgg gctcatacac 20580 

ctacgagtgg aacttcagga aggatgttaa catggttctg cagagctccc taggaaatga 20640 

cctaagggtt gacggagcca gcattaagtt tgatagcatt tgcctttacg ccaccttctt 20700 

ccccatggcc cacaacaccg cctccacgct tgaggccatg cttagaaacg acaccaacga 20760 

ccagtccttt aacgactatc tctccgccgc caacatgctc taccctatac ccgccaacgc 20820 

taccaacgtg cccatatcca tcccctcccg caactgggcg gctttccgcg gctgggcctt 20880

cacgcgcctt aagactaagg aaaccccatc actgggctcg ggctacgacc cttattacac 20940 

ctactctggc tctataccct acctagatgg aaccttttac ctcaaccaca cctttaagaa 21000 

ggtggccatt acctttgact cttctgtcag ctggcctggc aatgaccgcc tgcttacccc 21060 

caacgagttt gaaattaagc gctcagttga cggggagggt tacaacgttg cccagtgtaa 21120 

catgaccaaa gactggttcc tggtacaaat gctagctaac tacaacattg gctaccaggg 21180 

cttctatatc ccagagagct acaaggaccg catgtactcc ttctttagaa acttccagcc 21240 

catgagccgt caggtggtgg atgatactaa atacaaggac taccaacagg tgggcatcct 21300 

acaccaacac aacaactctg gatttgttgg ctaccttgcc cccaccatgc gcgaaggaca 21360 

ggcctaccct gctaacttcc cctatccgct tataggcaag accgcagttg acagcattac 21420 

ccagaaaaag tttctttgcg atcgcaccct ttggcgcatc ccattctcca gtaactttat 21480 

gtccatgggc gcactcacag acctgggcca aaaccttctc tacgccaact ccgcccacgc 21540 
gctagacatg acttttgagg tggatcccat ggacgagccc acccttcttt atgttttgtt 21600

tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660 

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720 

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140 

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200 

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380 

aaatgctttt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440 

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680 

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340 

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580 

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640 

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760 

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880 

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180 

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240 

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420 

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480 

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780 

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840 
gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900

atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960

caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020

cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080

gtggagcttg agtgcatgca gcggttcttt gctgacccgg agatgcagcg caagctagag 25140

gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200

gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260

aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320

tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380

gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440

gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500

cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560

aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620

gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680

ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740

ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800

aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860

cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920

taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980

caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040

ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100

gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160

tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220

gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280

cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340

cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400

gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460

gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520

gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580

gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640

ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700

tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760

cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820

cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880

agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940

aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000

acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060

actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120

tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180

aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240

ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300

gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360

ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420

gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480

actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540

taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600

cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660

cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720

ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780

gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840

cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900

tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960

tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020

ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080

gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140

gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200

cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260 

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320 

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380 

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440 

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500 

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560 

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620 

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680 

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740 

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800 

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860 

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920 

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980 

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040 

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100 

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220 

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280 

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340 

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400 

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460 

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520 

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580 

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640 

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700 

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760 

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820 

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880 

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940 

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000 

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060 

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120 
gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180 
ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240 
ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300 
acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360 
agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420 
tgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480 
acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540 
gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600 
gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660 
acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720 
taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780 
atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840 
aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900 
ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960 
aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020 
actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080 
gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140 
gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200 
cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260 
gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320 
aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380 
gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440 
ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500

gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560 

actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620 

gagcccattt atacacaaaa tggaaaacta ggactaaagt acggggctcc tttgcatgta 31680 

acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740 

tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800 

aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860 

tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920 

aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980 

aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040 

acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100 

acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160 

gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220 

aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280

aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340 

gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400 

agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460 

gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520 

gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580 

agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640 

attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700 

ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760 

actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820 

ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880 

tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940 

acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000 

catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060 

caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagt tcatgtcgct 33120 

gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180

agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240 

ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300 

ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360 

ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420 

aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480 

agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540 

cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600 

tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660 

ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720 

ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780 

cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840 

aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900 

cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960 

agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020 

caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080 

tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140 

tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200 

tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260 

cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320 

cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380 

aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440 

aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500 

ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560 

ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620 

agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680 

ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740 
gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800

gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860

tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920

ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980

tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040

caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100

ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160

ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220

cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280

gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340

cgaccgaaac agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400

ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460

tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520

cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580

gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640

actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700

ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760

cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820

ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880

accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935



<210> 10
<211> 5965
<212> DNA

<213> Artificial Sequence
<220>

<223> NSsuboptmut



<400> 10 

gccaccatgg cccccatcac cgcctacagc cagcagacca ggggcctgct gggctgcatc 60

atcaccagcc tgaccggacg cgacaagaac caggtggagg gagaggtgca ggtggtgagc 120

accgctaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180

ggagccggaa gcaagaccct ggccggaccc aagggcccta tcacccagat gtacaccaat 240

gtggatcagg atctggtggg ctggcaggcc cctcccggag ccaggagcct gacaccctgt 300

acctgtggaa gcagcgacct gtacctggtg acacgccacg ccgatgtgat ccccgtgagg 360

cgcaggggcg attctcgcgg aagcctgctg agccctaggc ccgtgagcta cctgaagggc 420

agcagcggag gacccctgct gtgtccttct ggccatgccg tgggcatttt tcgcgctgcc 480

gtgtgtacca ggggcgtggc caaagccgtg gattttgtgc ccgtggaaag catggagacc 540

accatgcgca gccctgtgtt caccgacaac agctctcccc ctgccgtgcc ccaatcattc 600

caggtggctc acctgcacgc ccctaccgga tctggcaaga gcaccaaggt gcccgctgcc 660

tacgccgctc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc taccctgggc 720

ttcggcgctt acatgagcaa ggcccatggc atcgacccca acatccgcac aggcgtgcgc 780

accatcacca ccggagctcc cgtgacctac agcacctacg gcaagttcct ggccgatgga 840

ggctgcagcg gaggagccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900

accaccatcc tgggcattgg caccgtgctg gatcaggccg aaacagctgg agccaggctg 960

gtggtgctgg ccacagctac ccctcctggc agcgtgaccg tgccccatcc caatatcgag 1020

gaggtggccc tgagcaacac aggcgagatc cccttctacg gcaaggccat ccccatcgag 1080

gccatccgcg gaggcaggca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140

gctgccaagc tgagcggact gggcatcaac gccgtggcct actacagggg cctggacgtg 1200

tcagtgatcc ccaccatcgg cgatgtggtg gtggtggcca ccgacgccct gatgacaggc 1260

tacaccggag acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320

ttcagcctgg accccacctt caccatcgaa accaccaccg tgcctcagga tgctgtgagc 1380

aggagccaga ggcgcggacg caccggaagg ggcaggcgcg gaatttatcg ctttgtgacc 1440

cctggcgaaa ggccctctgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgct 1500
ggctgcgctt ggtacgagct gacacccgct gaaaccagcg tgcgcctgcg cgcttatctg 1560

aatacccctg gcctgcccgt gtgtcaggac cacctggagt tctgggagag cgtgttcaca 1620 

ggactgaccc acatcgacgc ccatttcctg agccagacca agcaggctgg cgacaacttc 1680 

ccctatctgg tggcctatca ggccaccgtg tgtgctaggg cccaagctcc acctccttca 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccctacccct 1800 

ctgctgtacc gcctgggagc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgctgatctg gaagtggtga ccagcacctg ggtgctggtg 1920 

ggaggcgtgc tggccgctct ggctgcctac tgcctgacca ccggaagcgt ggtgatcgtg 1980 

ggacgcatca tcctgagcgg aaggcccgct atcgtgcccg atcgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgtgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgaac agttcaagca gaaggccctg ggcctgctgc agacagccac caaacaggcc 2160 

gaagctgccg ctcccgtggt ggaaagcaag tggagggccc tggagacctt ctgggctaag 2220

cacatgtgga acttcatctc tggcatccag tacctggccg gactgagcac cctgcctggc 2280 

aaccccgcta tcgccagcct gatggccttc accgctagca tcacctctcc cctgaccacc 2340 

cagagcaccc tgctgttcaa cattctgggc ggatgggtgg ccgctcagct ggcccctcct 2400 

tcagctgctt ctgcctttgt gggcgctggc attgccggag ccgctgtggg cagcattggc 2460 

ctgggcaaag tgctggtgga tattctggct ggctatggcg ctggcgtggc cggagccctg 2520 

gtggccttca aggtgatgag cggagagatg cccagcaccg aggacctggt gaacctgctg 2580 

cctgccattc tgagccctgg agccctggtg gtgggcgtgg tgtgtgctgc cattctgagg 2640

cgccatgtgg gacccggaga gggcgctgtg cagtggatga accgcctgat cgccttcgcc 2700 

tctcgcggaa accacgtgag ccctacccac tacgtgcctg agagcgacgc cgctgccagg 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac accctgcagc ggaagctggc tgagggacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccaactg 2940 

cctggcgtgc ccttcttctc atgccagcgc ggatacaagg gcgtgtggag gggcgatggc 3000 

atcatgcaga ccacctgtcc ctgcggagcc cagatcacag gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccctaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggaccctg cacacccagc cctgctccca actacagcag ggccctgtgg 3180 

agggtggctg ccgaggagta cgtggaggtg accagggtgg gagacttcca ctacgtgacc 3240 

ggaatgacca ccgacaacgt gaagtgtccc tgtcaggtgc ccgctcccga attttttacc 3300 

gaagtggatg gcgtgcgcct gcatcgctat gcccctgcct gtaggcccct gctgcgcgaa 3360 

gaagtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cctgagcccg atgtggccgt gctgaccagc atgctgaccg accccagcca catcacagcc 3480 

gaaaccgcta aaaggcgcct ggccaggggc tctcctccaa gcctggcctc aagcagcgct 3540 

agccagctgt ctgctcccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840 

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900 

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960 

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020 

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080 

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140 

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200 

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320 

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgctc gcctgatcgt gttccccgat 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgcctcag 4740

gtggtgatgg gctcaagcta cggcttccag tacagccctg gccagcgcgt ggagttcctg 4800
gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac acgctgcttc 4860

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccctg aggccaggca ggccatcaag agcctgaccg agcgcctgta catcggaggc 4980 

cctctgacca acagcaaggg acagaactgc ggatacaggc gctgtagggc ctctggcgtg 5040 

ctgaccacca gctgtggcaa caccctgacc tgctacctga aggccagcgc tgcctgtcgc 5100 

gctgccaagc tgcaggactg caccatgctg gtgaacgccg ctggcctggt ggtgatttgt 5160 

gaaagcgctg gcacccagga agatgctgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

aggtactctg cccctcccgg agacccccct cagcccgaat acgacctgga gctgatcacc 5280 

agctgctcaa gcaacgtgag cgtggctcac gacgccagcg gaaagcgcgt gtactacctg 5340 

acacgcgatc ccaccacccc tctggctcgc gctgcctggg aaaccgctcg ccatacaccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccta ccctgtgggc tcgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gctcaggagc agctggagaa ggccctggac 5520 

tgccagattt acggcgcttg ctacagcatc gagcccctgg acctgcccca aatcatcgag 5580 

cgcctgcacg gcctgtctgc cttcagcctg cacagctaca gccctggcga aattaatcgc 5640 

gtggccagct gtctgcgcaa actgggcgtg cctcctctgc gcgtgtggag gcatagggct 5700 

aggagcgtga gggccaggct gctgagccag ggaggcaggg ccgctacctg tggaaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ctatccctgc cgctagccag 5820

ctggacctga gcggatggtt cgtggctggc tacagcggag gcgacatcta ccacagcctg 5880 

tctcgcgctc gccctcgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965

<210> 11 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Chimeric NSsuboptmut 
<400> 11 

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60 

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120 

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240 

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300 

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360 

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420 

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480 

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540 

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600 

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660 

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720 

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780 

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840 

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960 

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020 

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200 

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260 

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380 

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440 

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500 
ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620 

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800 

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920 

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980 

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160 

gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220 

cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280

aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 

cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 

agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 

ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 

gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 

cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 

cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 

agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 

cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 

atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 

cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 

ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300 

gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360 

gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 

gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 

agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840 

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900 

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960 

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020 

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080 

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140 

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200 

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320 

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740 

gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 
gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 

cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 

ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 

gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 

gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 

agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 

acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520 

tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580 

cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640 

gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700 

cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820 

ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880 

agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965

<210> 12 
<211> 10 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Ribosome binding site 
<400> 12 

gccaccaugg 10

<210> 13 
<211> 49 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic polyadenylation signal 
<400> 13 

aauaaaagau cuuuauuuuc auuagaucug uguguugguu uuuugugug 49 

<210> 14 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Additional nucleotides present in pV1Jns-NS
<400> 14 

tctagagcgt ttaaaccctt aattaagg 28



<210> 15 
<211> 15
<212> DNA

<213> Artificial Sequence 



<220> 

<223> Additional nucleotides present in pV1Jns-NSOPTmut



<400> 15 
tttaaatgtt taaac 15



<210> 16 
<211> 24 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Oligonucleotide primer 



<400> 16 

tcgaatcgat acgcgaacct acgc 24



<210> 17 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Oligonucleotide primer 
<400> 17 

tcgacgtgtc gacttcgaag cgcacaccaa aaacgtc 37